Recently, I was asked a question: how do you create sustainable knowledge management? The answers they had previously received apparently ran along these lines: different knowledge management systems, the processes, the governance, the artifacts they would produce, and the structure of the dedicated team that works on knowledge management. Mind you, this was not a project on knowledge management, but a program that the client was getting into. They thought that they would need knowledge management for that.

There is nothing wrong with the answer as such. Years of management learning taught us that if we want to solve a problem, we hire people to do it, allocate resources to it, and show commitment to it. In fact, we talk about how to do “organizational change management” (OCM).

Sure enough, there are examples of such cases. Famously, IBM showed how it can create new systems and support them. Microsoft reinvented itself in the internet era by doing exactly the same: creating a new, focused division.

Yet, in this case, this answer is entirely wrong. Unless the company is in the business of selling knowledge, knowledge management systems will fail. This kind of approach fails when the focus is not meant to be knowledge management. This problem is particularly common in a lot of knowledge sharing applications, collaboration systems, training systems, and so on.

The general explanation that people give is that the designers failed in creating the right programs. That they focused only on computer systems and neglected OCM. Or, that they did not allocate enough money for support.

All of these reasons are correct, yet not completely so. There is a reason why a lot more money is not allocated – the management does not see the ROI. And there is a reason that they neglected OCM – it was not the focus.

So, do we say that these systems are not important? Can we make them important? Or, should we even make them important?

Let us look at this issue in depth. As organizations try to become digital, they recognize the importance of these kinds of supporting systems and yet do not know how to make them succeed. They go about bringing in a lot more consultants to change the organizational culture, and continue to fail. Therefore, the issue deserves a more in-depth understanding.

Theory of sustainability

In the question I was asked at the beginning, there was a key word: “sustainable”. We normally see that word in ecology, architecture, and social studies. There, people talk about enduring systems, systems that replenish themselves, systems that do not require external stimulus to support themselves. Normally, in ecology, people talk about the footprint of a system: the effort it takes to create the system, maintain it, and so on. To make a system sustainable, we want to make sure that it has a low footprint, and that it does not depend on somebody’s explicit effort to keep it going.

For example, let us look at sustainable architecture: it means the footprint of the architecture is low. It can sustain itself – perhaps through low energy usage and perhaps with solar power. Also, it may recycle water. It may reduce wastage. All of these metrics mean that it does not take much to maintain the architecture. For example, in my village, the thatched houses are built with crossflow ventilation and open spaces. These days, the modern houses are built with bricks and window A/C units. Come summer power cuts, the modern houses are unbearable, whereas the old houses are merely uncomfortable. It is the same with urban design: when it is harmonious with nature, it is easy to sustain. Most cities these days cannot sustain themselves without a lot of effort going into the roads, parking lots, public transportation, sanitation, and water services. Imagine that we build cities for people! We may be able to make them sustainable.

[See the contrast between two designs: the first one relies on parking spaces – it needs more and more parking space as it needs more and more space and only cars can traverse the distance. The second one is built for people, where walking is the way to get around. It works because it needs less space. Pictures courtesy: http://www.andrewalexanderprice.com/]

Surely, we can take lessons from these systems in designing our knowledge management system? Designing for people, with low maintenance or no maintenance? Or, working with the way people work, instead of trying to change them? Working with the existing culture instead of imposing a new culture? Making “nudge”-like social experiments instead of, say, a great leap forward?

These are hard questions, with no definite answers. Yet, we can make an attempt at understanding the role of sustainable systems in IT and key elements to creating such systems.

Three kinds of IT systems

Let us say that a bank is getting into a new business of serving mass affluent customers. Historically, it served only customers with 10 million dollars or more. Now, it wants to extend the same services to people with 250K. Let us say that the organization is building a new division to support this initiative.

Even if it sounds easy, this is a difficult task. Firstly, the employees are used to handholding the millionaires. They are used to bypassing the processes to cater to the super rich. Now that they are working with regular rich folks, their old ways cannot scale to the new numbers. And, their compensation earlier may have depended on different metrics: handling a few HNIs. Now, they have to use different metrics – maybe total money under management. The systems they use are going to be designed for the most common uses.

Let us confine ourselves to the computer systems (and supporting processes) needed to run this division. They fall into three kinds: core systems, required support systems, and optional support systems.

Let us see what these systems do and where sustainability may make a big difference.

Core systems

Without these core systems, the business doesn’t even run. The management makes sure that the focus is on these systems. These systems run with stringent SLAs, with adequate funding, and often with milestones, bonuses, and penalties. These metrics are very closely aligned to business metrics. These systems do not need to be sustainable – there is enough support to sustain them by providing external stimulus.

However, there can be a case for making the architecture of these systems sustainable. Often these systems are rewritten to support changes in business. These rewrites can be expensive and may delay the response to the markets. If we make the architecture sustainable, then we can make changes to the system without causing major changes to the core part of the architecture. For instance, if we have good modularization, good interfaces, and a coherent philosophy of data, we can extend the existing architecture without changing its core. For instance, see how Unix endured for the last 50 years, without fundamental changes.

This kind of “built-for-change” architecture can make core systems resilient to change – and can create sustainable architecture.

Required support systems

The required support systems are not part of the core systems. But, to run the core systems we need these systems. The metrics on these systems, the SLAs and how well they are run, do not directly translate to the main core metrics. Yet, they are important, like insurance. For instance, if security is a problem, you would hear about it. If operations is a problem, you would again hear about it. If there are no problems, the chances are you won’t hear about these systems at all.

One interesting aspect is that these systems are very similar across different businesses. For example, would HR be that different for a bank than for, say, a retail company? The chances are that they can get away with using computer systems that are generic – say, software as a service in the cloud. Or, better yet, they can even outsource most of these functions. Cloud fits in here very well, either as a full end-to-end service offering or a partial service offering.

Since these systems are mandatory, even if there is no direct business value, IT is often measured on these systems. Because of this reason, these systems do not need to be self-sustaining. Still, since these are not core to the business, very rarely do companies want to innovate in these systems.

Optional support systems

These systems are what management talks about in the vision. For example, when a company says that employees are the best assets, these are the systems supposed to increase the value of that asset. Every visionary program talks about how these systems are created and maintained.

And, yet, these are the systems that end up failing often.

Since these are not in the way of core delivery, even if these systems fail, nobody raises alarms. They are like the climate change harbingers. A few polar bears dying? Not a mainstream concern! Yet, we all know that without these systems, a modern digital enterprise cannot survive.

Why did I make such a bold statement? Take a look at the models of the modern digital enterprise: Google, Facebook, Netflix and such. Almost all of these place a lot of emphasis on hiring, creating, and nurturing talent. They take these optional systems and turn them into competitive assets, so much so that most companies that want to be digital want to emulate them on these aspects. The whole idea of innovation labs, or devops, or 2-speed IT, or data driven organizations takes these practices as a way of creating such dynamic organizations.

An organization that wants to see these optional support systems as a core strength need not look at digital enterprises alone. It can look at the open source ecosystem, where these optional systems are well developed and sustained. For instance, in the open source world, if a volunteer joins a new software development effort, how does he go about it?

He can read the code and documentation on GitHub.

He can read the old notes or discussions on mailing list archives or Google Groups.

He can participate on the Slack channels.

He can fork the code.

He can fix some simple issues and contribute back.

He can enhance the documentation.

In short, without taxing any resources, he can become a productive member of the team. In fact, these are the salient features of this development process that make it sustainable:

All the optional systems are baked into the required systems: the knowledge management is a part of the version control system.

The optional systems are simple. For instance, most of the documentation is written in markdown. It is a part of the source code. It is not elaborate, but it gets quickly to the point of bootstrapping a newcomer.

It welcomes contributors. The process of forking and contributing code is simple. In fact, the first thing people handle is this simplification, before they set the project up on GitHub.

If the leadership changes, the entire project can be cloned. There is no hidden information.

In contrast, in an enterprise, this is how it might be:

A local, special toolset that forces people to learn before they contribute

A special team that is supposed to maintain onboarding documentation, which is often out of date

If one has to contribute to the project, it will take handholding by an existing (often busy) team member

No quick gratification for the newcomer – which means there is no incentive for newcomers to join the program.

In short, the open source way is sustainable because it is simple and welcoming to newcomers, who can then sustain the system.

Sustainable organizations

The same principles work in creating organizations as well. Most organizations focus on customers and deliverables, which contribute to the direct metrics. The indirect metrics that enable these metrics in the long term may be in focus for a while, but the moment they become less important, they fall by the wayside. Because they are not sustainable, there is every danger of them becoming a wasteland of systems that receive only lip service.

Most organizations do realize this problem. Their solution is to increase the focus on this problem. But, that does not solve the problem forever. Instead, it becomes like US healthcare. The joke is that people with rich plans end up getting bad treatment, as everybody wants to treat them for something. And, people with no insurance also end up getting bad treatment. Only with just the right kind of insurance will we get the right treatment.

The idea is that the focus should be on creating such sustainable systems – systems that can run without continued focus. These systems should run as a byproduct of the main metrics. They should run because of established traditions. They should run because they serve the selfish interests of people. They should run because the collective interest of the people forces others to conform.

Checklist for sustainable systems

We started with the knowledge management system and how to make it sustainable. I never gave any answer to that question. Instead, I went on to explain sustainability. Now, let us get back to the sustainable knowledge management system. While we cannot define it, we may be able to create a checklist to see if a system is sustainable. Obviously, we want it to have a low footprint, but that is not enough. The following checklist may be a good start.

In the steady state the system should not require any additional resources to support it.

Until it gets to the steady state, the stakeholders should commit to it.

Every company claims to be customer centric these days. But, then, if you look back, companies always claimed to be customer centric. “We are focused on what the customer wants, right?” they say. In fact, it was and is the bedrock of American industry, we are told.

All true. Indeed, organizations have always been focused on the customers. However, when it comes to IT, the story is not so straightforward. Consider the following evolution of IT: from system centric, to process centric, to people centric.

Let us look into a brief background of this evolution, with some examples.

System Centric IT

As organizations started going beyond payroll systems to using computers in running the business, the primary concern was the functionality of the applications. To a large extent, the PCs of that era also influenced the purchase decisions: we wanted as many tick marks as possible in features.

Consider the IT manager trying to automate procurement. Any procurement application requires partner qualifications, tender management, contract management, PO generation, quote generation, etc. (I am not an expert in this area, but you get the gist). The IT manager looks at various products and makes a judgment based on several factors, including functional fitment. That is how IT comes to be built of several systems, each selected for different functional capabilities.

What is the flipside of this kind of selection? Let us look at the problems and the ad-hoc solutions that IT had to deal with because of the fragmented systems in IT.

Overlapping functionality

The overlapping functionality of the systems in the enterprise leads to a lot of silly problems that IT vendors have been making money off for several years:

Of course, there is a simple solution: buy one system that solves every problem of running a company. The big ERP systems (an excellent example would be SAP) promise exactly that. You get a unified view of the world, with one application (even if it is broken into modules) dealing with standardized data, clear delineation of responsibilities, and best practices about what functionality is to be implemented in what module.

Despite all these promises, enterprises will still have to build applications, buy different applications for different functionalities, and live with the consequences.

Process Centric IT

The system perspective enables IT to automate all the core functions. Yet, when we look at an enterprise, more and more, the business sees itself as carrying out some processes. For example, procurement is a process, managed by a business unit.

This process view in a company has several benefits. Firstly, companies can measure themselves by process. For instance, how long does it take to procure a new computer for a company? If that is the metric that needs to be lowered, then people can modify the process, optimize the process, or simply train the people to process requests faster. Secondly, process is natural to companies, as they are optimizing for operations. Since operational responsibilities fall along the lines of process, it is convenient to create KPIs based on the process.

In the world of IT that is built on systems, this process centricity poses a problem. Let us look at the ways:

If a process is entirely contained in a single application, it looks almost like a functionality, right? Then, what is the problem? The problem is that the process view demands a certain uniform approach to process management. For example:

Process visibility: Where exactly is that request for new onboarding of a supplier stuck?

Process metrics: How long does it take to process this new supplier on-boarding?

Process delegation: Jane went on vacation. Can John who reports to her approve the request?

Process allocation: We have three people verifying the supplier bank information. That is where the lag is. Can we increase the number of people handling that task?

Exceptional processing: We have an important project for which we need this new supplier. Can we hand-hold the process to move it past the gate?
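To make these requirements concrete, here is a minimal sketch in Python of what a uniform process record might look like. The class names, fields, step names, and dates are all hypothetical, made up for illustration; no real BPM product is being described. It only covers visibility, metrics, and delegation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Step:
    name: str
    assignee: str
    started: datetime
    finished: Optional[datetime] = None


@dataclass
class ProcessInstance:
    """One run of a process, e.g. onboarding one supplier (hypothetical model)."""
    subject: str
    steps: List[Step] = field(default_factory=list)

    def current_step(self) -> Optional[Step]:
        # Visibility: the first unfinished step is where the request is stuck.
        return next((s for s in self.steps if s.finished is None), None)

    def elapsed(self, now: datetime) -> timedelta:
        # Metrics: time from the start of the first step to the end of the
        # last one, or to `now` if the last step is still open.
        end = self.steps[-1].finished or now
        return end - self.steps[0].started

    def delegate(self, from_user: str, to_user: str) -> None:
        # Delegation: hand every open step of one person to another.
        for s in self.steps:
            if s.finished is None and s.assignee == from_user:
                s.assignee = to_user


start = datetime(2024, 1, 1, 9, 0)  # arbitrary example date
request = ProcessInstance("new supplier", [
    Step("collect documents", "john", start, start + timedelta(hours=2)),
    Step("verify bank information", "jane", start + timedelta(hours=2)),
])
request.delegate("jane", "john")  # Jane went on vacation
stuck = request.current_step()
print(stuck.name, "->", stuck.assignee)
print(request.elapsed(now=start + timedelta(days=1)))
```

The point of the sketch is that these questions are easy to answer only when every process keeps its state in the same shape. Inside a packaged application, each vendor answers them differently, if at all.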

If a process spans multiple applications, that creates additional problems. For example:

All the process management requirements become that much harder to implement. For example, consider delegation: You may have to repeat the same delegation in two different systems.

Integrating the processes is complex. We need to bring the data in sync between different applications.

Process optimization is complex. It is difficult to change two systems and keep them in sync when the overall process is changing.

If a process is not fully automated, it creates even more problems:

Process management may be impossible, with manual steps. In particular, visibility, metrics, and allocation – all have to be done manually, leading to inconsistent results.

Process operations may be costly, because of the manual steps. In addition, it may be error prone and inconsistent.

With all these problems with systems view of IT in a process world, what is the solution?

BPM for process management in a company

BPM emerged as a general mechanism for business process management in an enterprise.

Consider the implications of the picture presented:

It is preferable to implement a process outside of the systems, in the BPM, if we are looking to manage effectively.

Even if the process is contained in one application, it still may be preferable to implement it in BPM.

BPM can provide additional benefits, in addition to process management:

Customization of the process, which may be simpler to do in BPM than in the application

Reconciling different data models – customers, partners, orders etc.

Automating simple tasks without building custom systems, up to full automation of the process.

The enabling technology is SOA, which lets applications work with a BPM system.

People Centric IT

The process centric view is fine for a lot of purposes. It lets IT support the efficiency drive in enterprises. It lets the companies focus on the processes that matter most. In fact, it gives the organizations a blueprint for understanding which processes to customize and which processes to leave alone.

But, one important aspect that gets hidden is this: How do people use the systems? Are they happy to use the systems? Do these systems work in the context of the usage (like, say on the go, or in the office)?

As an analogy, consider a government office, say, the DMV. If you are there for license renewals, you stand in one line. If you are there for registrations, another one. Auto titles, yet another one. That is, users are tasked with figuring out where they should stand.

Much like that, most processes have their own way of serving users. Even if we were to integrate them in one portal, the experience is different for each. In fact, users may be forced to learn a lot of different interfaces depending on the context.

Or, consider a corporate website. Typically, it is designed around the way the company is organized. But, a visitor does not care about how the company is organized – they only care about how their interactions are organized.

When we start taking a user centric approach to IT, we should not expose the systems or processes to users. After all, systems are the way vendors developed the applications, and processes are the way organizations carry out their work. We should design the interactions of users with the internal systems and processes and focus on providing the user interface supporting the context of the usage.

What customer centricity means to IT

If a company decides to reshape their IT to incorporate customer centricity, what should it do?

First and foremost, it should realize that most of the systems that it can acquire will still be operating in the old-fashioned mode. And, internal processes will remain the same. Until products and internal applications are retooled for customer centricity, we will have to implement customer centricity differently.

Designing for one

Ideally speaking, we should design an application for every user. After all, every user is different in the way they want to use the systems. But then, it is too costly for that level of customization even with personalized applications.

The right approach to that is Extreme Segmentation. That is, we do not divide people into segments based on static attributes. Gone are the days when people were classified into established buckets such as “college educated over 50’s white male”. Instead, we record several attributes of users that let us group users into very many overlapping sets.
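As a rough illustration of “very many overlapping sets”, here is a small Python sketch. The users and attributes are invented for the example; a real system would choose its own attributes. Every (attribute, value) pair becomes a segment, a user lands in several segments at once, and finer segments are formed by intersecting them on demand.

```python
from collections import defaultdict

# Hypothetical users and attributes, for illustration only.
users = {
    "asha": {"device": "mobile", "frequency": "daily", "role": "approver"},
    "ben": {"device": "desktop", "frequency": "daily", "role": "requester"},
    "chen": {"device": "mobile", "frequency": "rare", "role": "requester"},
}


def build_segments(users):
    """Map each (attribute, value) pair to the set of users that have it."""
    segments = defaultdict(set)
    for name, attrs in users.items():
        for key, value in attrs.items():
            segments[(key, value)].add(name)
    return segments


segments = build_segments(users)

# Segments overlap: the same user shows up under device, frequency, and role.
# A narrower segment is just an intersection of broader ones.
mobile_daily = segments[("device", "mobile")] & segments[("frequency", "daily")]
print(sorted(mobile_daily))
```

Nothing here fixes the buckets in advance; adding an attribute adds new segments without redefining the old ones.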

To support that many segments, we may want to retool the IT in the following way:

Self-provisioning, community-supported, discoverable applications

Easy to develop, deploy, and operate applications

Standardized applications catering to segment specific interactions.

I will be writing more about extreme segmentation in digital marketing soon.

Three main groups of users

While it is nice to talk about extreme segmentation, I find the following three groups of users very important:

Each group of users uses the IT systems differently. As such, customer centricity would mean something different for each group. We can consider other groups such as partners, operations people etc. – their needs tend to be a combination of the needs of these three groups.

Standard approach to customer centric applications

Finally, let me summarize by describing a standard reference architecture that can be used for developing customer centric applications.

This is somewhat simplistic, but the essential principles are going to be the same even in a full-blown picture. In summary, the architecture uses:

APIs to build the experience on top of existing applications

Modern UI technologies to design applications

Applications that can replicate functionality, in catering to different sets of users.

Concluding remarks

In this note, we discussed the emergence of customer centricity in IT. We described what it means to IT and architecture. In subsequent notes, I will describe the following:

What extreme segmentation means and how it works in digital marketing

A prescriptive approach to creating a customer centric IT – an enterprise architecture approach

An in-depth look at architecting a customer centric application: concept, design, and implementation

What if I were to tell you that the e-commerce site you’ve been developing could be hosted on a static website that would offer it premier performance? Or that the document management system could be run on any plain, old static server like Apache or nginx? Stick with me and we’ll explore this topic. This information may be too basic for advanced developers, but for most managers, this introduction may be long overdue.

A bit of history

Long ago, most websites were static. All they served were pages – html, text, and images. Hyperlinks allowed us to navigate from one page to another. Most were content-rich websites containing information, entertainment, and photos. Now it seems unthinkable that clicking on the hyperlinks was the only way to interact with the sites.

As shown in the picture, the protocol between the web server and the user is fixed: HTTP. The way the web server interacts with backend information (in the case of the static web, it is HTML, CSS, JS, images and such) is up to us. A static website serves this information straight out of the file system.

As readers began wanting to navigate sites in interesting ways, or to interact with the websites by adding content (like comments), we began to need server side programming. Serving out of a file system won’t do. The application needs to understand complex requests and somehow use the information in the files (or databases) to serve the information. Since a database is a popular choice to keep structured information, the web started serving as an interface to the database, typically through simple CRUD operations. In fact, applications like phpMyAdmin even let people manage the databases on the web.

Coming back to the web applications I mentioned (e-commerce, digital marketing, document management systems), all these websites are considered interactive, dynamic, and programmable, and therefore need a web application server. The moment you choose one platform (typically it means an app server, a language, a framework, a set of libraries), you’re stuck with it forever.

Let’s see how we can design our way out.

Static web with automated look and feel support

Let us consider a simple case of information presentation. In this setup, all the consumers want to do is read the information. In that case, why not stick to a static website?

We can. The advantages are obvious. First and foremost, a static website is simple. It does not require any moving parts. Any old webserver can serve static content. You can even run a static website with a simple one-liner in Python (python -m SimpleHTTPServer in Python 2.7 or python3 -m http.server).
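For readers who would rather see the one-liner spelled out, here is a minimal sketch of the same idea as a Python 3 script, using only the standard library. The function name, the default port, and the choice to bind to 127.0.0.1 are my own assumptions; this is a development convenience, not a production setup.

```python
import functools
import http.server


def make_static_server(directory=".", port=8000):
    """Build an HTTP server that serves the files under `directory` as-is."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Bind to localhost only; pass port=0 to let the OS pick a free port.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


# To serve the current directory until interrupted:
#     make_static_server().serve_forever()
```

There is no routing, no database, and no template engine in there: the request path maps directly to a file on disk.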

It is also incredibly fast. The server simply has to send pages – it does not have to do any other computation. As such, it can be sped up even more by caching the content locally, or even by using a CDN (content delivery network). You can also use special purpose webservers such as nginx that are optimized for serving static content.

Furthermore, it is simple to maintain. You can back it up in the cloud, move to a different provider, and even serve it from your own desktop. You do not need to back up a database or worry about software upgrades, portability issues, or even support.

But, what are the downsides? Let us consider the problems:

How do you create nice looking pages? Let us say that you want to create a standard look and feel. If you are the kind that uses hand-coded HTML, you will cut and paste the HTML template and then edit it. Pretty soon, it becomes complex. If you want to change the look and feel, you will have to go through all the pages and edit the same few lines in each.

How do you create standard elements of a website? For example, you want a menu scheme. If you add a new area, then you have to go to each page and add a link in the menu to this new area of website. Or, if you want to create a sitemap, you will find yourself constantly updating the sitemap.

Considering that a website has repetitive information depending on the context, the static way of maintaining a website is cumbersome. That is why, even for static websites, people prefer WordPress, Joomla, or Drupal – all of which are meant for dynamically creating websites.

In these systems of dynamic content generation, the content is generated based on the request. No matter what application you use, it will have the following issues:

The application becomes slow: The server needs to execute the program to generate a page and serve the page. Compare it to the earlier scenario where the server merely had to read the file from the file system and serve! In fact, the server can even cache it, if the file doesn’t change often.

The application becomes complex: All the flexibility of combining content with the themes based on configuration comes at a price. The application server has to do many things.

Consider the following full description of the stack. Any web server that generates pages dynamically needs to depend on an execution engine. It can be an engine that interprets a language, or executes a binary. For any language, we need a framework. For instance, if we are using JavaScript, we can use Express as the framework. This framework can do many things. At a bare minimum, it can route the calls – that is, it interprets any request and maps the request to the right response. Next, it needs to have libraries that deal with writing the logic to actually manipulate the data. To present the data, we need a templating engine.

Of course, you have fancy names like Model (the data), view (the templates etc), and the controller (the routing) for this kind of framework (MVC).
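To see how little is behind those names, here is a toy sketch of the three pieces in plain Python. It is not how Express or any real framework is written; the document store, the template, and the route table are all made up for illustration. What it does show is that every request runs code to produce its page.

```python
from string import Template

# Model: the data (a hypothetical in-memory document store).
DOCUMENTS = {"1": {"title": "Onboarding guide", "body": "Start here."}}

# View: a template that turns data into HTML.
PAGE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")


def show_document(doc_id):
    """Logic: look up the data and render it, once per request."""
    doc = DOCUMENTS.get(doc_id)
    if doc is None:
        return 404, "Not found"
    return 200, PAGE.substitute(doc)


# Controller: routing maps a request path to the function that answers it.
ROUTES = {"/docs": show_document}


def handle(path):
    """Resolve a path such as /docs/1 and generate the page on request."""
    prefix, _, arg = path.rpartition("/")
    handler = ROUTES.get(prefix)
    if handler is None:
        return 404, "Not found"
    return handler(arg)


status, html = handle("/docs/1")
print(status, html)
```

Even in this toy form, the lookup, the rendering, and the routing all happen at request time, which is exactly the cost a static file avoids.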

The problem with any stack is that once you are committed to it, it is incredibly difficult to change it. To make matters worse, the frameworks that let you start easily (for example, a lot of PHP-based frameworks) are not the best frameworks for enterprise strength requirements (unlike, say, Java, which is excellent in libraries that deal with a lot of integration). Ask Facebook! They started with PHP and had to do a lot of optimizations to get the performance they wanted.

How can we do better? Can we still use the static website, and support better usability of the website? Reduce the costs of maintenance? The answer is yes. If we were to separate content generation from content delivery, we can get the best of both worlds.
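A minimal sketch of that separation, in Python, might look like the following. The layout, the page format, and the function name are assumptions of mine, not any particular tool’s design. The build step runs once, ahead of time, and applies a common layout and a generated menu to every page; delivery is then left to any static web server.

```python
from pathlib import Path
from string import Template

# One shared layout gives every page the same look and feel.
LAYOUT = Template(
    "<html><body><nav>$menu</nav><main>$content</main></body></html>"
)


def build_site(pages, out_dir):
    """Render every page once, ahead of time, into plain HTML files.

    `pages` maps an output file name to a (title, content) pair.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # The menu is generated from the full page list, so adding a page
    # updates the navigation on every page at the next build.
    menu = " | ".join(
        f'<a href="{name}">{title}</a>'
        for name, (title, _) in sorted(pages.items())
    )
    for name, (_, content) in pages.items():
        html = LAYOUT.substitute(menu=menu, content=content)
        (out / name).write_text(html, encoding="utf-8")
    return sorted(p.name for p in out.glob("*.html"))
```

Changing the look and feel now means editing the layout and rebuilding, rather than touching every page by hand.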

From this activity, what did we gain?

We gained performance: The runtime has to serve static files, giving us several different options to improve the performance.

We gained simplicity, somewhat: Technically, it is possible to mix and match different systems of generation (which is somewhat like using different languages and compiling down to machine code).

Readers can observe that content delivery networks can do this job of caching generated content on-the-fly, giving us the illusion of static generation of content.

Static website generators

If we are going to generate a static website, what options do we have? Let us see the alternatives.

Firstly, we can use the same dynamic content generators and dump out the entire site as static pages. That is, if we are using Drupal, we can dump all the nodes. The same goes for WordPress or any other content management system.

The challenge is, what if there is no unique URL for a piece of content? A lot of content management systems offer multiple ways of accessing the same content. For example, Lotus Notes was notorious for such URL generation. Then, dumping the static content can be difficult. Moreover, these systems are not meant for static website generation – the limitations keep showing up as you start relying on them.

Secondly, we can use WYSIWYG editors such as Dreamweaver. They can create static content, apply a theme, and generate the website. They come with several beautiful themes, icons, and support scripts as well.

The challenge is that these systems are not programmable. Suppose you are generating content from an external system. These systems do not provide a way to automate the content ingestion and the updating of the website.

Thirdly, we can use a programmable system that generates the content. These days, this is the favored approach. These systems generate or update the complete website just from text based files. You can programmatically import content, update the website and publish it to production – all through scripting. Furthermore, they offer standard templating, support for CSS and Javascript libraries.

The downside, of course, is that these systems are not mature. They are not general purpose either. Still, for an experienced programmer, this is a wonderful option.

There are several examples of the third type of generation system. The most popular ones are the ones that support blogging. For instance, Jekyll is a popular static website generator, written in Ruby, with a blog flavor. The content is authored in Markdown format. Octopress is built on Jekyll and supports blogs. In the JavaScript world, there are blacksmith, docpad, and a few more.

Out of all the contenders, for my use, I like hugo and docpad. Hugo is the simplest of the lot and extremely fast. Docpad is more versatile, supporting a lot of different formats, templates, and plugins. In hugo, all that I had to do was to create a folder hierarchy and drop in .md files as description. Based on the folder structure, it creates the menus, content layout, and the URLs. Docpad is a bit more complex, but it is essentially the same.
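To make the idea of "folder structure drives URLs and menus" concrete, here is a minimal sketch in Python. It is not how hugo or docpad actually work internally; the function names and the URL convention (one trailing-slash URL per .md file, with index.md standing for its folder) are assumptions chosen only for illustration.

```python
from pathlib import PurePosixPath


def url_for(md_path):
    """Map a content path such as 'docs/setup/install.md' to a URL.

    Assumed convention (illustrative only): every page gets a trailing-slash
    URL, and 'index.md' stands for the folder that contains it.
    """
    p = PurePosixPath(md_path)
    parts = list(p.parts[:-1])
    if p.stem != "index":
        parts.append(p.stem)
    return "/" + "/".join(parts) + ("/" if parts else "")


def build_menu(md_paths):
    """Group page URLs by top-level folder to form a one-level menu."""
    menu = {}
    for path in sorted(md_paths):
        p = PurePosixPath(path)
        # Files at the root of the content tree go under the "" section.
        section = p.parts[0] if len(p.parts) > 1 else ""
        menu.setdefault(section, []).append(url_for(path))
    return menu
```

For example, build_menu(["about.md", "blog/a.md", "docs/index.md"]) groups the pages into a root section, a "blog" section, and a "docs" section, without any configuration beyond the folder layout.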

Static web with high interactivity

There is a big problem with the earlier approach. Consider the example we were giving about a document management system: what if we want to search for the right document? Or, sort the documents by date? Or, slice and dice the material? Or, draw graphs, based on the keywords?

For all these tasks, historically, we depended on the web server to do all the heavy lifting. Do you want to sort the documents? Send a request to the server. Do you want to filter the documents? Send another request.

While this kind of interaction between the user and the web server gets the job done, it is not terribly efficient. It is a burden on the server; it increases bandwidth requirement; it feels sluggish to the user.

Thankfully, the situation changed with the advancement of JavaScript. It turns out that when an HTML page is parsed by the browser, the browser creates a data structure called the DOM. With JavaScript, you can query and change the DOM. That means, at run time, you can sort, filter, display, hide, animate, draw graphs – all with the information already available to the browser.

With this kind of power, we can design interactive websites without going back to the server. With this additional requirement, how we develop web pages and what technologies we use will be different.

See the sequence diagram titled “JS based interactive sites”. The web server sends the data, html, and javascript the first time. After that, any interaction is handled by the script itself. It can compute the new content based on the user interaction, from the existing data and make the necessary modification to the page by changing the elements of DOM.

The implications of this design option are enormous. We do not need to burden the server; we do not need to overload the network; we provide quick interaction to the customer.

The range of interactions is limitless. For instance, if we are offering an electronic catalogue of books, we can search the titles, sort by authors, filter by publishing date, and so on.

In fact, these kinds of interactions are common enough to have several mature libraries supporting them. For example, for the tasks I mentioned in the previous paragraph, I would use dataTables. If I am doing the same with millions of entries, I would use Pourover by NYTimes (which used this library for their Oscar awards fashion slicing and dicing web page).
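The catalogue interactions above boil down to searching, sorting, and filtering a list that is already in memory. The following is a minimal sketch of that logic, written in Python for brevity; in the browser the same thing would be done in JavaScript, usually through a library such as the ones named above. The Book record and the sample titles are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    author: str
    year: int


def search_titles(books, term):
    """Case-insensitive substring search on the title."""
    term = term.lower()
    return [b for b in books if term in b.title.lower()]


def sort_by_author(books):
    """Return a new list ordered by author name."""
    return sorted(books, key=lambda b: b.author.lower())


def published_since(books, year):
    """Keep only the books published in `year` or later."""
    return [b for b in books if b.year >= year]


# Made-up sample data, standing in for the catalogue shipped with the page.
CATALOGUE = [
    Book("Static Sites in Practice", "Rao", 2014),
    Book("The Dynamic Web", "Adams", 2009),
    Book("Practical Caching", "Lee", 2012),
]
```

None of these operations needs a round trip to the server: once CATALOGUE is loaded, every search, sort, or filter is just a pass over local data.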

Static web for collaboration

For the kind of interactivity we discussed earlier, Javascript libraries work well. If you think about it, all those web pages are providing is read-only interactivity. What if you want read/write interactivity?

For example, suppose you have a website with lots of content, and you want to give people a way of adding comments. Take the case of Octopress mentioned earlier – we may want to add commenting capability to those blog posts. How would we do that? We certainly need server-side capability for that.

Fortunately, the server-side capability need not come from us. We can easily use a 3rd party server for these capabilities. For instance, we can use Disqus or Facebook for comments. We can use Google Analytics to track the users.

In fact, the 3rd party server ecosystem is so strong these days, we can develop highly interactive websites, with just static content served out of our server. You can learn what other leading web companies are using on their web pages and what services they are using from http://stackshare.io/.

This action creates a checkout button. When the user checks out, it creates a token and hands it to the browser. You do need to have some server-side component that takes this token and charges the user – it goes beyond the usual static website I was describing. But, for most other, less security-critical websites, you can conduct the updates from the browser itself.

You find this code on several websites; it posts click data to the Google server.

Admittedly, most services need a server to interact with them, for security purposes. Nevertheless, the heavy lifting is done by some 3rd party web server.

Static website special case: Server less setup

A special case of a static website is a desktop website. Suppose you want to share a folder with your friends. The easiest way is to put the folder in Dropbox and share it with them. Now, suppose you want to provide some interactivity, say, searching the files. What would you do? You could host the files on a website, but that is too much trouble. You could run a local web server, but that is too complex for most people. Why not run the site with the file:// protocol, without requiring a server, by directly opening a file in the browser?

This approach works surprisingly well. The workflow could be as easy as this:

Let people, including you, place the files in the shared folder.

Watch the folder on update (or, do it periodically) and run a script that generates the data file.

The user can open the index.html file, which uses the data file to create a web page.

With a suitable library (like datatables), the user can navigate, search, and sort the files.

This is a simple poor man’s content management service. You can enhance the document authoring process to add other kinds of metadata to the files, so that you can perform more effective slice and dice operations on the folder.
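The "script that generates the data file" step in the workflow above can be sketched as follows. This is a minimal illustration, not a finished tool: the output name files.js, the global name window.FILES, and the chosen metadata fields are all assumptions. The data is written as a script that assigns a global, rather than as plain JSON, because many browsers restrict pages opened over file:// from fetching other local files, while a script tag still loads.

```python
import json
import os
from datetime import datetime, timezone


def build_index(folder):
    """Walk `folder` and return one metadata record per file."""
    records = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            full = os.path.join(root, name)
            info = os.stat(full)
            records.append({
                "name": name,
                # Forward slashes, so the path works as a relative link.
                "path": os.path.relpath(full, folder).replace(os.sep, "/"),
                "size": info.st_size,
                "modified": datetime.fromtimestamp(
                    info.st_mtime, tz=timezone.utc
                ).isoformat(),
            })
    return sorted(records, key=lambda r: r["path"])


def write_data_file(folder, out_name="files.js"):
    """Write the index as a script that index.html can load with a
    script tag. `out_name` and `window.FILES` are hypothetical names."""
    # Skip the data file itself if it is left over from an earlier run.
    records = [r for r in build_index(folder) if r["path"] != out_name]
    out_path = os.path.join(folder, out_name)
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write("window.FILES = " + json.dumps(records, indent=2) + ";\n")
    return out_path
```

Running write_data_file on the shared folder, either from a folder watcher or on a schedule, refreshes the index; the page then reads window.FILES and hands it to the table library.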

Static web for e-commerce: An exercise for you

Let us take it up one more notch. Can we design an entire e-commerce site as a static website (or, at least with minimal server components)? We want to be able to display the catalogue, let the users browse, discover, and even read/write the comments. In addition, we should let them add items to shopping cart and check them out. We may even want to add recommendations based on what you are looking at.

Now, how many items can we keep in the catalogue? Remember that images are in separate files. We only need the text data. Looking at general information, it is at most 2K per item. While there is no hard limit to the amount of data a browser can load, anecdotal evidence suggests that 256MB is a reasonable limit. So, 100,000 items (about 200MB) can be displayed in the catalogue without problems. Remember that all this data and the images can be served out of a CDN.

We can do one better. We do not have to load all the items at once. We can load parts of items, based on demand. Now, if the commerce site has different divisions, and the customer chose one of them, we only need to load that part.

If we can reduce the number of items to, say, 10,000 to start with, that makes it less than 20 MB, which is the size of a small video. So, it is entirely reasonable, from a user experience perspective, to load 20 MB for a highly interactive site.
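The sizing argument above is simple arithmetic, and it is easy to check. The sketch below uses only the two figures from the text (2K per item and a 256MB budget); reading "K" and "MB" as powers of 1024 is an assumption.

```python
BYTES_PER_ITEM = 2 * 1024         # "at most 2K per item"
BROWSER_BUDGET = 256 * 1024 ** 2  # "256MB is a reasonable limit"


def catalogue_bytes(items):
    """Worst-case size of the text data for `items` catalogue entries."""
    return items * BYTES_PER_ITEM


def to_mb(num_bytes):
    return num_bytes / 1024 ** 2


# Largest catalogue that fits in the budget: 131072 items.
max_items = BROWSER_BUDGET // BYTES_PER_ITEM

# 100,000 items come to about 195.3 MB, inside the 256MB budget.
full_catalogue_mb = to_mb(catalogue_bytes(100_000))

# 10,000 items come to about 19.5 MB, the "less than 20 MB" in the text.
first_load_mb = to_mb(catalogue_bytes(10_000))
```

So the 100,000 item figure leaves some headroom under the 256MB budget, and a first load of 10,000 items stays just under 20 MB.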

What about other services? We can manage the cart in JavaScript. The actual checking out (payment, receipt, and communication to the backend) needs to be done on an actual server. Anything less would make the system less secure. Anybody with knowledge of JavaScript can easily spoof the browser, so it is best not to make direct calls from the browser to a backend that assumes the data coming from the browser is valid. All you are doing in the browser is providing the integration!
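One way to read the advice above: the server treats the cart sent by the browser as untrusted input and recomputes everything that matters from its own data. The following is a minimal sketch of that idea; the price table, the SKU names, and the shape of the cart payload are all hypothetical.

```python
# Hypothetical server-side price list, in cents. The browser never
# supplies prices; it only says which items and how many.
SERVER_PRICES = {"sku-1": 1250, "sku-2": 499}


def price_cart(lines):
    """Recompute the cart total from server-side prices only.

    `lines` is the untrusted list sent by the browser, for example
    [{"sku": "sku-1", "qty": 2, "price": 1}]. Any "price" field that
    the browser sends is ignored.
    """
    total = 0
    for line in lines:
        sku = line.get("sku")
        qty = line.get("qty")
        if sku not in SERVER_PRICES:
            raise ValueError(f"unknown item: {sku!r}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(qty, bool) or not isinstance(qty, int) or qty <= 0:
            raise ValueError(f"invalid quantity for {sku!r}: {qty!r}")
        total += SERVER_PRICES[sku] * qty
    return total
```

A spoofed cart that claims a price of 1 cent for sku-1 is still charged at the server's price, and unknown items or negative quantities are rejected before any payment call is made.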

We can think of some more modifications. What if we design a mobile application? We only need to ship the deltas in the catalogue. After the user chooses from the catalogue, the application can send a request to fulfillment with some additional credit card information.
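Shipping "only the deltas" can be sketched as a diff between two versions of the catalogue. The sketch below assumes the catalogue is a mapping from SKU to an item record; the "upsert" and "remove" field names are made up for illustration.

```python
def catalogue_delta(old, new):
    """Compute the changes needed to turn catalogue `old` into `new`.

    Both catalogues map a SKU to an item record (a dict).
    """
    return {
        # New items, plus items whose record changed.
        "upsert": {sku: item for sku, item in new.items()
                   if old.get(sku) != item},
        # Items that are no longer in the catalogue.
        "remove": sorted(sku for sku in old if sku not in new),
    }


def apply_delta(old, delta):
    """Apply a delta on the device, returning the updated catalogue."""
    updated = {sku: item for sku, item in old.items()
               if sku not in delta["remove"]}
    updated.update(delta["upsert"])
    return updated
```

The device keeps its cached catalogue and only downloads the delta, which is usually far smaller than the full data file when a handful of prices or items change.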

Now, go ahead and do the following tasks:

Draw the technical architecture diagram

Create a data model and generate sample data in JSON

Create a set of javascript programs to support the following activities

users browsing the catalogue

Adding items to the cart

Checking out the cart (here, you might need some server components – run them on a different server).

Searching the catalogue

Managing the cart

For additional credit, do the following

Cross-selling and upselling (people who bought what you looked for also bought the following; or, buy this additional item at a discount). Discuss the security implications.

Develop a mobile application that caches the entire catalogue on the device. Figure out a way to keep the catalogue data synchronized on demand.

Because of Facebook, I have been in constant touch with friends, acquaintances and even people that I did not meet. I am using this annual letter as a way of summarizing, introspecting, and filling in the gaps in my usual communications about technologies to friends. It is heavily slanted towards technology, not the usual intersection of business and technology.

There are three ways that I learn about technologies. One is by experimenting on my own. By actually coding, practicing, verifying hunches, validating ideas, and playing, I learn a bit. By talking to customers, sales people, engineering managers, and developers, I understand what the problems in applying technologies are. By reading books, newspapers, and blogs, I try to see the interrelationships between different subjects and the influence of technology in the modern world. Let me review, from the annual perspective, what these three different influences taught me.

(Cloud) Container based technologies

I played quite a bit with container based technologies. Specifically, I tried various docker based tools. In fact, I set up systems on Digital Ocean that let me create a new website, make modifications, and push it to the public in less than 10 minutes. That is, using the scripts that I cobbled together (most of them are public domain, but I had to make some tweaks to suit my workflow), I can create a new docker instance, initialize it with the right stack, provision a reverse proxy, push the content to that machine, install all dependencies, and start running the system.

Minimal, cloud-ready OS’s

From my experiments with docker, I see that the power of a virtual machine that can run any OS is not of much use to me. I don’t want to think in terms of an OS. What I want is a stack to program in. My OS should be a cloud-aware OS. It should be minimal, should work well in a hive, should support orchestration, and should be invisible. Since I am unlikely to work in it interactively, I would place a premium on programmability and support for REST services. Of course, it needs to be secure, but since it is a minimal OS, I want security at different layers.

Based on all these requirements, I see hope for CoreOS kind of OS. I don’t know if coreos is it, but something like that — a minimal, cloud ready OS is going to succeed. Companies like Google and Facebook already use such systems for their internet scale applications.

(Server side technologies) Node.js technologies

I entered this year having learnt a bit about node.js. I have a love-hate relationship (who doesn’t?) with JavaScript. On one hand, I love its ability to treat functions as first class objects, its polymorphism, its ability to extend objects etc. Its approach to type system is dynamic, flexible, and incredibly powerful.

Yet, I find a lot of things to hate about JS. Its scoping is infuriating. Its support for the basics of large scale programming is absent. Its ability to type check is minimal. It leaves out crucial features of programming, letting users create competing idioms.

For small to medium systems that are cobbled together by REST services, node.js still is a quick way of getting things done. And, I like npm — it is even better than CPAN. I am not a fan of reinventing the wheel with all the tools like gulp, bower etc. The proliferation of these tools is confusing, putting off a casual user with their own standard tools. (Java was the original culprit in these matters.)

In Node.js, the technologies I played with are:

Express: Of course, duh. There may be better ones there, but I needed one standard one in my arsenal. This one will do.

Mongo: The power of using JavaScript across all the layers is seductive. I don’t have to translate from one format to another. And, let somebody worry about performance (well, not actually, but will do for the most part).

The usual set of libraries for parsing, slicing and dicing, and template management.

Front end JavaScript

I have been frustrated with the front end technologies. There are too many frameworks. MVC? MVCC? And, the complex edifice makes my head swim. At the end, I am not able to make sense of it all. Thankfully, I see things changing. I am able to settle down on a few choices for myself — not because they are the best (in some cases, they are), but they are good for a generalist like me.

JQuery: I still find it the best way to manage the DOM from JS. I never bothered to learn the full DOM API, and I find JQuery a convenient way of managing it.

Web components: Specifically, I fell in love with Polymer. Why? Think about this. HTML has a standard set of tags. Suppose you want to introduce a new tag. You need to create JavaScript that parses the new tag and manages it for you. So, your code is strewn in a few places: the JavaScript code, CSS specifications, and the HTML. It is too messy, too unmaintainable, and more importantly, difficult to mix different styles.

Enter web components. You can create new elements as first class entries in DOM. The looks of the new element are specified via CSS in there itself. The behavior through the JavaScript also goes there. You expose specific properties and interactivity. The big thing is since it is first class element in DOM (as opposed to translated to standard elements through JS), you are able to reference it from JQuery and manage it just like you would a heading.

Since not many browsers have implemented web components, we need a polyfill, a way of mimicking the behavior. Thanks to Polymer, we now have JavaScript code that makes it appear that the browser supports this DOM behavior of web components. This polyfill intercepts every call to the DOM and translates it appropriately.

Summary: It is slow and buggy at the moment. In time, it will take off, creating a nice 3rd party market for web components. It is almost like what COM did for Microsoft.

Assortment of libs: While I did not reach my ideal setup (where the machine can speak IKWYM, the “I know what you mean” language), there are several libs that help me now. Specifically, I like templates with inheritance, like nunjucks. I also use Lodash to make life simpler, and async.js to use the async paradigm of JavaScript.

HTML look and feel

As an early adopter of Bootstrap, I stand vindicated in pushing it for my corporate users. Nowadays, almost all development we do is responsive, built on standard template driven development. Within that, I dabbled with a few design trends because I found them interesting:

Parallax Effect: You see pages where the background images roll slower than the text? It gives a 3D layering effect. It is particularly effective in creating narrative stories. When the images are also responsive, this 3D effect can make the web pages come alive. To take a look at some examples, see: http://www.creativebloq.com/web-design/parallax-scrolling-1131762

Interactive HTML pages: Imagine you are telling a story where, by changing some choices, the story changes. For example, tabs change the content. But imagine creating new content based on the user input, not merely showing or hiding existing content. For instance, once we know the name of the reader, age, and other details, it is easy to change the text to incorporate those elements. Or, if we know what they need, we can address it directly in a sales dialog. While I did not carry out this idea to completion, I satisfied myself with the technology and the framework to build this interactive narrative sales tool. Naturally, this framework has a little bit of JS magic.

Auto generation of web pages: As I was writing text, converting the text to HTML and a suitable web page became an obsession with me. I finally settled on using a combination of md5, bootstrap, and yaml properties to generate a good looking web page to share with people.

If you are interested, please see these two blog posts, from yester years:

Static web apps

As I started seeing the advances in web front ends, I saw the possibilities of the static web site. For instance, it is easy to create and run a static e-commerce application with 10K or so SKU’s without any trouble. We can even have a recommendation engine, a shopping cart, and various payment methods, all thanks to web services and HTML5.

The following are the technologies that I found useful in the static websites.

markdown for html generation: For generic content, markdown is an easy format to author in. In fact, we can even author books in this format.

asciidoc for book authoring: For larger format HTML with more whizbangs, asciidoc is what I tried.

For the in-page manipulation of large number of objects, I find the following very useful:

pourover: The amazing slice and dice web sites for displaying large tables are built with the pourover library from NY Times. I have high hopes for it. I think there are a lot of innovative uses for this library, given its performance and its ability to cache results.

One of my favorite problems is to develop a web application without any db in the backend, a pure static application that acts as a library interface. We can search, slice and dice the selection using various criteria. For instance, the latest document about a particular technology, written by author XXX, related to document YYY.

Mobile & Hybrid application development

I have, for a long while, bet on hybrid application development instead of native applications. Now, I concede that at the higher end of the market, native apps have an edge that is unlikely to be equaled by hybrid applications. Still, in the hands of an average developer, hybrid platforms may be better. They are difficult to get wrong for simple applications.

This year, I was hoping to do native application development, but never got around to it. With polymer not yet completely ready, I dabbled only a little with an Angular based framework called Ionic. It was OK for the most part.

Still, for simple pattern based development, I hold a lot of hope for Yeoman. For corporate contexts, one can develop various scaffoldings and tools in the Yeoman generator framework. That leads to compliant applications that share the standard look and feel without expensive coding.

Languages

In my mind, right now, there are three kinds of languages: ones that run on the JVM, including Scala, Java etc.; ones that translate to JavaScript, including Typescript, Coffeescript etc.; and the rest, like Go. Innovation in other languages has slowed down.

Despite that, the three languages I dabbled in this year are: Scala for big data applications, specifically for spark; Python, again for data applications, specifically statistical processing; and JavaScript, as I mentioned earlier. I liked typescript, especially since it has support from Visual Studio. I started on R, but did not proceed much further. Another language I played with a bit is Go, in the context of containers and deployments.

Data and databases

This year, I gave a 3 hr lecture in Singapore on big data, in the context of the internet of things. I should have written it up in a document. The focus of that talk was on what the different areas of interest in big data are, and what technologies, companies, and startups are playing in those areas.

These holidays, I experimented with Aerospike, a distributed KV database developed by my friend Srini’s company. Whatever little I tried, I loved it. It is easy to use, simple to install, and fast to boot. According to their numbers, it costs only $10 per hour to do 1 million reads per second on google compute platform. I will try to replicate that and see how it compares against other databases like Redis and Cassandra that I am familiar with.

On the statistics front, I familiarized myself with the basics of statistics, which is always handy. I focused on http://www.greenteapress.com/thinkbayes/thinkbayes.pdf to learn more. I followed Quora to learn about the subjects in Coursera. I wanted to get to machine learning, but that will have to wait for 2015.

One particular aspect of big data and analytics that fascinates me is visualization. I played with D3 – it is of course the basis of most of the visualization advances that we see these days (http://bost.ocks.org/mike/). I am on the lookout for other toolkits such as Rickshaw. I will keep following the field to see the upcoming advances that make it more mainstream.

Customer conversations

Since most of these conversations have some proprietary content, I cannot give full details here. In general, the focus is on innovation and how corporations innovate in the context of established businesses. Typically, it is a mix of technology, process, and organizational structure transformations to change the way businesses are run. I will need to talk about it in bite-sized postings some other time.

Wish you a happy 2015! May your year be full of exciting discovery! See you in 2015!

I looked at the way that corporate training happens in technology areas, and I find it wanting in several respects. This is my attempt at bringing some best practices to corporate or institutional training.

Most corporations have training programs, especially the ones that deal with IT technologies. The goal of these trainings is to train people so that they can be useful, productive members of a project. This is meant for training competent engineers, craftsmen who can support delivery of projects.

A typical company gets two kinds of people to train:

People fresh out of college: They went through latent learning, without clear goals. They learnt different subjects in college, without clear understanding of how that knowledge is meant to be used. They tried to understand concepts without real practice.

People with a little experience: They worked on a few projects, churning out code. Conditions in the industry are such that they were not exposed to quality code. Most of them do not understand the value of quality, nor do they understand what quality is.

Current training methodology: What is wrong with it?

Typically, any corporate training follows a standard process: they get nominations on who to train. They hire or get instructors who are experts in that area. They put all of them in a room, sometimes away from all the distractions. Over the course of a day or two, the instructors take them through (with the aid of PowerPoint) the nuances and the details of the material. For example, if the training is on Java, the students will go through static methods, public access, annotations, etc. If the training is advanced, they might even cover some patterns of usage as a way of best practices.

Typically, evaluation of the students is carried out through multiple choice questions that test them on the intricate details of the language. These include a lot of trick questions to test the understanding of the language.

What are the problems with this approach? Let me count the ways:

It doesn’t train the students for what they encounter in the job. It doesn’t make them a better developer or project manager or whatever.

It only tests the book knowledge, which is almost one Google query away. It is that much cheaper to invest in a good quality internet connection.

After a few days of training, they forget the knowledge. Of course, they can always look up a book when they need to – but that was the situation they were in, to start with, anyways.

Even if we train them using actual small projects, these problems will continue to exist.

New way of training: What we want to achieve

The education market in the US is being continually disrupted. I am going to list a few lessons from those disruptions and later describe how to apply those lessons to our training process.

Let us see each of these lessons in turn.

Inversion of classroom and home

Known as flip teaching, this method of teaching became popular because of Khan Academy. The problem with classroom training is that the teachers lecture at the same pace for everybody. When the students need help with homework, they are on their own.

In flipped learning, the instructor doesn’t teach via lecture. There are enough books and videos that can teach the subject. Instead, the instructor, in the classroom setting, works with the group to solve problems.

Practicing for muscle memory

The ceramics teacher announced on opening day that he was dividing the class into two groups. All those on the left side of the studio, he said, would be graded solely on the quantity of work they produced, all those on the right solely on its quality. His procedure was simple: on the final day of class he would bring in his bathroom scales and weigh the work of the “quantity” group: fifty pound of pots rated an “A”, forty pounds a “B”, and so on. Those being graded on “quality”, however, needed to produce only one pot -albeit a perfect one – to get an “A”. Well, came grading time and a curious fact emerged: the works of highest quality were all produced by the group being graded for quantity. It seems that while the “quantity” group was busily churning out piles of work – and learning from their mistakes – the “quality” group had sat theorizing about perfection, and in the end had little more to show for their efforts than grandiose theories and a pile of dead clay.

If you want to learn JavaScript, write a lot of code. Through writing the code, you keep improving. None of those “Learn X in 10 easy steps” can get you to this level of deep learning.

The hacking community realizes this aspect of learning well. Look at the following approach to learning by coding a lot: LxTHW (Learn X the hard way). The approach started with a popular book on Python, called Learn Python the Hard Way. If you follow that book, you don’t endlessly learn syntax and nuances of the language. Instead, you will code, and you will code a lot. It starts out this way:

Do not cut and paste. Type the code as is.

Make the code run.

Now, modify the code to solve slightly different problems. And, make that code run.

Keep repeating till you internalize all the concepts through practice.

In fact, this kind of learning through practice is applied by several people successfully. An example that I found very impressive was the case of Jennifer Dewalt. On April 1st, 2013, she started on a project to develop 180 websites in 180 days – one each per day. All these websites are simple enough to be coded in a day. With practice she got better and better; you can see the progress of her websites for yourself.

Even more experienced programmers, like the inventor of JQuery, John Resig, feel that writing code every day helps them keep their skills. Here is the famous blog post that he wrote: http://ejohn.org/blog/write-code-every-day/

In summary, our courses should not be teaching tricks and nuances of languages, libraries, or tools; they should be teaching people to practice the craft.

Attention to basics

The big obstacle in coding or practicing the craft is not having the right basics. Even when two people are practicing equally, the one with the better basic tools will win.

Unfortunately, most colleges do not teach the tools of the trade. They focus, rightly, on the fundamentals. Yet, there is no part of the education that covers the good use of tools and ways of working.

If practicing is the way to teach, the students need to have the right infrastructure to practice. In fact, practicing on that kind of infrastructure teaches them how to use such infrastructure. These days, the basic infrastructure should be:

Making use of the cloud

Making use of the right OS, editors, and IDE

Making use of the right tools for version control, bug tracking, requirements etc

Even for small projects, it is best to have at least a cloud based setup to try the code, a version control to keep the code history, and the right editor to train your muscles to use the right key strokes.

Quality through osmosis

We can get people to practice on the right infrastructure and churn out finished goods fast. But, will the output reach world-class quality?

While we made the argument that quality comes from quantity (through long practice), a more important ingredient is having the right role models. That is where the right mentorship can help. This is where intervention by a teacher who gives feedback on the quality of the output can help.

There are multiple ways we can bring this to practice – say if you are learning programming in JavaScript:

Especially, in the third step, and to lesser extent other two steps, good mentors can help. They can point out what the best practices are, idioms are, and why some practice is good.

Assessment through real-life tests

The current crop of testing, because of automation, is focused on multiple choice questions. Unfortunately, it only focuses on the nuances of the programming language or system. The world is far more complex; there is no single correct answer. Even if your problems were to come in the form of these questions, you could always find out answers from the internet.

In contrast, in real life, you will have to produce code or an artifact. Why not prepare you for that, during training? Here is an approach that works:

Pick a large enough problem that we encounter during our work. A lot of these problems require “programming in the large”.

Abstract it sufficiently so that it can be solved in limited time. For instance, you can choose only one module (say, adding new users to the system).

Write a full fledged system that solves that sub-problem.

If multiple people choose different parts of the system, then, we can have a fully functioning system at some point.

Training process

If you were to subscribe to this approach of corporate training, how would you start? Here is the approach I suggest.

Start with a clear statement of what you want the students to be able to do. This part is surprisingly difficult. For instance, do not say “I want my students to learn Java”. That does not give them a new capability. Instead say “I want them to solve the following kinds of problems”. Now, these problems can be your way of assessing their status after the training.

Create pre-conditions for the material: Not only for assessment, but also as a way of setting expectations, you could ask the participants to do some tasks. For instance, if they are going to be doing JavaScript programming, they should know how to use Firefox or Chrome extensions. You could test for that.

Create curated content for the material: Since there is so much material online, create an annotated list of material. For instance, you could give links to articles, books, slideshare entries, or Youtube videos, with some notes about what to expect from each. You could construct them as a series of lessons.

For each lesson, create a set of problems they should solve: In fact, the more problems they practice, the better it is. If there is an entity like LxTHW, we could just follow that. If not, create a set of problems for them to solve so that the lessons really sink in.

Create a basic infrastructure with tools: Even before they start solving problems, provide them an infrastructure such that:

They can use the tools to develop the solutions

They can use the infrastructure to collaborate with others

They can use it to show the mentors

They can test their solutions

Provide the mentors: This part is tough. As we mentioned earlier, at the minimum, show them (embedded in the curated content) what you consider to be good quality to aim for.

Create a post-training evaluation: Create a large enough problem so that people can each choose a part of it to solve. Using the mentors, see:

how fast they are developing the solution (it indicates how they internalized the solution – practice makes for speed).

how good a solution they are developing (it indicates how well they learnt from the masters)

Create a community so that they can interact with each other post training: A training is not complete unless it is followed up. Since resources are difficult to get, use the power of community. Create a social space where people can discuss what they learnt even after they graduate from the course.

Concluding Remarks

I am no fan of corporate training; but I realize that not all people are self-motivated to learn and learn well. I think corporate training can be reformed to take advantage of the modern best practices such as incorporating internet material, using repetitive training, and intentional techniques in training, especially for acquiring technical capabilities. This note is an attempt towards that direction.

You are the CIO. Or, the director of application development. You hear about consumerization of IT, in different contexts. You hear about mobile applications developed by teenage kids in only months, and these apps are used by millions of adoring people. And, your business people are asking why you can’t be more like those kids.

What should you do? Retrain your staff on some of the consumer technologies? Get a partner who has the skills in the consumer technologies? Move existing applications to consumer-friendly devices? Are they enough? And, why are you doing all these anyway?

In the last couple of years, I have been working with different IT leaders to evolve a definition and an approach to this problem. By training, I am a technology person, the kind that develops the consumer technology. By vocation, I help IT departments adapt technology to meet their strategy. Being on both sides of the fence, I have a perspective that may be interesting.

Consumerization of IT: A bit of history

The term, as coined in 2001, refers to the trend of building applications that are people-centric. Have we not always been doing that? Yes and no. While we were developing the applications for people, our main focus was somewhere else. The focus was on the growth of the business (by managing the volume of data), on automating activities to speed up the processes, on automating entire business value chains, or, only recently, on the customers.

When we were building applications earlier, we were building them for a purpose: to solve a business problem. People were another piece of the puzzle. They were meant to be a part of the solution, but not the purpose of the solution.

Enterprise IT application development

How are these applications developed? Take a look at the sample development process.

In the current traditional situation, the EA people map the needs of an enterprise to a technical gap, see if there is a packaged app, and either customize one or build a new one. The biggest questions often boil down to “Build vs. Buy” or, what to buy.

A few things that you will observe are these:

Applications take a long time to develop: Typically they are large, and serve long-term needs of the enterprise. Any other kind of application is difficult to retrofit into this model of development. For example, if the marketing department wants an application for a one-time event, the existing IT processes make it difficult to offer that service. That is why we find marketing is one of the prime movers behind consumerization.

Applications serve the common denominator: They address the most common needs of the people. If your needs are very different from others, they will be ignored, unless you are the big boss. No wonder that IT departments still develop applications with a “Best viewed on IE 6+” sticker.

Applications lag behind the market needs: Since the focus is to create applications with longevity, the design uses tested technologies. Given the pace at which these technologies are evolving, this design decision makes the technology foundations obsolete by the time the applications are delivered. For example, even today, IT departments use Struts in their designs, a technology that is already dead.

Applications, developed based on consensus needs, lack focus: Since there is going to be one large monolithic application meeting the requirements of several groups with different needs, the applications lack focus. For example, the same application needs to support new users and experienced users. Or, it needs to support management interested in the big-picture view and the workers interested in doing the processing. Naturally, any application that is developed for such diverse and divergent needs ends up being unfocused.

Applications are expensive to develop: Compared to consumer apps, where we hear of apps getting developed for a fraction of the cost, the process and the “enterprise quality” requirements impose a lot of additional costs on the application development.

That is, this process yields applications that are built to last. Let us look at how consumer applications are developed.

Consumer application development

Historically, consumer applications have been developed differently.

As you can see, in each era, the consumers are different; the focus is different; and the distribution mechanism is different. File it away, as this historic view is important as we look at consumerizing IT. Delving deeper into the current era, we see the following:

Consumer applications are almost always better focused on end results than on the users’ needs. For example, take the case of Instagram. In its history, it discovered that if it followed user needs and demands, it would end up being another Facebook clone. Instead, it decided to keep the focus on one metric: “How to get most number of photos uploaded by the consumers”. That focused design led to its success.

Consumer applications are also built in collaboration with the consumers. By creating a model of constant experimentation, feedback from the field, and the ability to improve the application without ramifications for user support, the creators of the applications are able to build systems that are “built for change”.

But, what are the disadvantages for the consumer applications, compared to enterprise applications?

Only interesting applications get developed: Go to Apple’s App Store, and you find so many weather apps and gaming apps. You do not find enough applications to do genome analysis. Developers are impatient with problems they do not understand, or problems that require a lot of knowledge to solve.

Capabilities may be replicated in many applications: The strength of consumer applications, namely catering to different groups of people, means some core functionality gets repeated across applications. Instead of getting high quality apps, we might end up with a lot of apps that are mediocre.

Lack of uniformity in solutions (depends on the platform): While some platforms are famous for creating a uniform and consistent experience, the applications themselves provide a fragmented experience. Unlike enterprise applications, they lack control or governance.

Consumerization: Why should IT care?

We established that enterprise applications and consumer applications have different focus. We also established that they are built, distributed, and operated differently. Still, why should IT care about consumer applications? Why should it consumerize itself?

I can think of three reasons.

Consumer focus of the businesses

Several service industries like retail, banking, entertainment, music, and health deal with consumers daily. Their business models are being disrupted by startups that bring new technologies and a new breed of applications. While IT does not exactly lead the business transformation, it can support the business better by at least bringing the right capabilities.

Internal users as consumers

Demographics of the employees are changing. More and more young people are joining the workforce. They are used to a different kind of experience, using modern devices and modern applications.

Even the older people are used to consumer applications: they use Gmail at home, FaceTime on their iPad, Facebook on their laptop, and LinkedIn at work. They come to work and they use Exchange without the benefit of Bayesian spam filters; they use Lync video instead of FaceTime or Hangouts; they do not even have something like Facebook or LinkedIn at work.

By not exploiting the modern technologies and application paradigms, enterprises risk losing productivity and the ability to attract the right talent.

Cheaper and better Consumer technologies

Large investments in consumer technologies are making them cheaper and better, at a faster pace than enterprise technologies. For instance, Git is improving at a faster pace than Perforce. Companies that took advantage of the cheaper alternatives in consumer technologies reaped the benefits of cheaper and better infrastructure, application construction, and operations. Google built its data centers on commodity boxes. Facebook leverages open source technologies fully for its needs. The following are the main reasons why the consumer technologies are often better choices than the enterprise grade technologies.

So, considering that enterprises are being pushed towards consumerization, how should IT react?

Consumerization: an IT perspective

The best course of action for IT is to get the best of both worlds. On one hand, it cannot run business as usual without its control and governance. On the other hand, it cannot react to markets innovatively without the consumer technologies.

As we bring both these best practices together, we see some interesting facts emerge. At least for certain aspects of the application domain, we see that the old style of large-scale application development does not work.

As consumerization increases, we end up with a large number of small applications instead of a small number of large applications. Of course, the definition of application itself is subject to change. For our purposes, consider any isolated, independent functionality that people use to be an application. Historically, we used to talk about modularizing the application. Instead, now, we break down a large application into smaller pieces. Some of these smaller pieces may have multiple versions to suit the extreme segmentation that consumerization supports.

If we are moving towards such an IT landscape, what does it mean for traditional activities? In fact, all the following are impacted by this aspect of consumerization.

Development

Life cycle plan

Deployment

Support

Governance

Enterprise Architecture

Let us look at some of these challenges.

Challenges in consumerization of IT

I see three challenges in consumerizing IT.

These costs are easy to rein in, if IT can bring in some of the consumer technologies. In the next section, we will describe each of the technology changes that can help IT address these challenges.

Coping with consumerization: A recipe for IT

There are four questions that we should ask ourselves as we are embarking on consumerization of IT:

Each of these questions requires an adjustment to the way IT operates. Each of these key concepts needs a full explanation. Since this article has already grown long, I am going to be brief in describing the key concepts.

Pace layered architecture

The idea behind pace layered architecture is that different parts of IT move at different speeds. For instance, front end systems move fast, as the technology advances faster there. The ERP packages move slowly, as they focus on stability. Based on this idea, we can divide IT systems into three groups:

Systems of record

Systems of differentiation

Systems of innovation
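As a minimal sketch, the three groups can be recorded as a tag on each system in an inventory, so that later decisions can be made per layer. The system names and the layer assigned to each are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from enum import Enum


class PaceLayer(Enum):
    RECORD = "system of record"
    DIFFERENTIATION = "system of differentiation"
    INNOVATION = "system of innovation"


# Hypothetical inventory: names and layer assignments are illustrative only.
INVENTORY = {
    "erp-general-ledger": PaceLayer.RECORD,
    "pricing-engine": PaceLayer.DIFFERENTIATION,
    "event-mobile-app": PaceLayer.INNOVATION,
    "hr-payroll": PaceLayer.RECORD,
}


def group_by_layer(inventory):
    """Return a dict mapping each pace layer to a sorted list of system names."""
    groups = defaultdict(list)
    for name, layer in inventory.items():
        groups[layer].append(name)
    return {layer: sorted(names) for layer, names in groups.items()}


for layer, names in group_by_layer(INVENTORY).items():
    print(layer.value, "->", ", ".join(names))
```

Keeping the layer as explicit data, rather than as tribal knowledge, makes it possible to apply different tooling and release rules to each group.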

If we were to divide systems this way, we know where consumer technologies play a big role: systems of differentiation and innovation. To take advantage of this idea for consumerization, I recommend the following steps:

Platform based development

Typically, when we develop applications, we are given just a set of tools that we can use: Java, app server, database etc. Putting together these parts into a working application is left to the developers. At best, standard configurations are offered.

Most of the tasks developers need to do are standard: add security, add user authentication, add help text, add logging, add auditing, and so on. Considering that there are a lot of standard tasks that developers need to do, is there a way that we can reuse the effort?

We have been reusing different artifacts in several ways. We used libraries, templates, and frameworks. With the advent of cloud technologies, we can even turn these into a platform that is ready for the cloud. Platforms turn out to be useful in several ways: they standardize development; they reduce the costs; they reduce the time to get the systems ready for development; they improve the quality of the systems.

In addition, within any enterprise, there might be standard best practices that can be incorporated into the platform. With these additions, we can enforce governance as a part of the platform based development.
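To make the idea concrete, here is a minimal sketch of how a platform could supply the standard tasks (authorization, logging, auditing) so that the developer writes only the business logic. The decorator name `platform_service`, the in-memory `AUDIT_TRAIL`, and the shape of the `user` dictionary are all assumptions made for illustration; a real platform would have its own conventions.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("platform")

# Stand-in for an audit store; a real platform would persist this somewhere.
AUDIT_TRAIL = []


def platform_service(required_role):
    """Wrap a business function with authorization, logging, and auditing."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user, *args, **kwargs):
            if required_role not in user["roles"]:
                AUDIT_TRAIL.append((user["name"], func.__name__, "denied"))
                raise PermissionError(f"{user['name']} lacks role {required_role}")
            log.info("calling %s for %s", func.__name__, user["name"])
            result = func(user, *args, **kwargs)
            AUDIT_TRAIL.append((user["name"], func.__name__, "ok"))
            return result
        return wrapper
    return decorator


@platform_service(required_role="hr")
def add_new_user(user, new_name):
    # Only the business logic lives here; the platform handles the rest.
    return f"created {new_name}"


print(add_new_user({"name": "asha", "roles": ["hr"]}, "ravi"))
```

Because the checks live in one place, governance rules can be changed in the platform without touching each application.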

There are industry standard platforms as well for specific purposes: Google platform, Facebook platform, Azure platform, and SFDC platform. Each of them offers different capabilities and can be used for different purposes. Instead of standardizing on one platform, an enterprise will have to classify its needs and plans, categorize its applications, and on that basis, devise multiple platforms for its needs.

Internally, Microsoft has positioned SharePoint services and Office 365 as such a platform. Coupled with .NET technologies, it can be a full platform for delivering user defined applications.

Backend as APIs

The potential of the platform can be fully realized if the enterprise data and functionality are available to the apps developed on the platform. For instance, the data locked in the ERP applications is valuable for many modern applications. Existing logic within the system is difficult to replicate elsewhere and may be needed by the application.

By providing this information, both data and logic alike, as APIs, we can enable internal as well as external applications. In fact, the current application development paradigms around API-based front end development offer several different frameworks for this style of development.

Using REST+JSON APIs, we can develop web as well as mobile applications from the same backend.
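As a rough sketch of what such an API looks like, the following uses only the Python standard library to expose one backend record as JSON over HTTP. The `/customers/<id>` path, the `CUSTOMERS` data, and the function names are hypothetical; a real deployment would sit in front of the actual backend and add authentication.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical data standing in for records locked in a backend system.
CUSTOMERS = {"42": {"id": "42", "name": "Acme Corp", "credit_limit": 5000}}


def get_resource(path):
    """Map a URL path such as /customers/42 to (HTTP status, JSON text)."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "customers" and parts[1] in CUSTOMERS:
        return 200, json.dumps(CUSTOMERS[parts[1]])
    return 404, json.dumps({"error": "not found"})


class ApiHandler(BaseHTTPRequestHandler):
    """Serve GET requests by delegating to get_resource."""

    def do_GET(self):
        status, body = get_resource(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Start the API on localhost; this call blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ApiHandler).serve_forever()
```

Calling `serve()` starts the server. A web page and a mobile app can then both fetch `/customers/42` and render the same JSON in their own way.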

Modern app stores

Once applications are developed, they need to be operational for the people. There are four different aspects to putting applications to use.

Historically, such an app store or delivery platform has been handled in several different ways. Popular choices for different ecosystems include Apple’s App Store, Google Play, and Facebook apps. If we build it right, we do not have to restrict the app store to mobile applications alone. Instead, the same delivery and support mechanism can support web applications as well.

Concluding Remarks

Consumerization of IT is a desirable trend, if handled correctly. The right way to handle it is to bring the useful elements of consumerization to the appropriate kinds of applications. The core features from consumerization include conceptualization of apps via pace layered architecture, development via platforms, integration via APIs, and delivery via app stores.

All of us developers can write code. We need designers for the look, feel, page designs, and flows. To get the services of a web designer, often we need to do a prototype. If the design is bad, nobody may even see the potential of the website. For a lot of web apps, there may not even be enough budget to attract a designer.

What should the developers do? Can they do the web design by themselves? Even if they can’t do a world-class design, can they create reasonably good web pages? This note, created from some of the lessons I taught my developers, answers the questions.

Small rant: Historically, clothes were hand made. They were ill-fitting, expensive, and of poor quality. After Mr. Singer came up with his sewing machine, they became cheaper and had better quality. Over time, they became reasonably well-fitted too. Still, a modern custom Italian suit is better than a mass-produced suit.

Most people hiring UX designers want that Italian design, which is better than the machine design. But, in reality, they are getting the ill-fitting, expensive medieval designs. For them, a better choice is a machine design, that is, a factory approach, using standards-based designs, with mass customization. It is an engineering approach like that of Mr. Isaac Merritt Singer. If you have the right budget, of course, you can go for high-end tailors or high-end designers.

If you have not read them already, please read the following blog posts:

If you are strapped for time, you can skip to the last section, where I give you a standard way you can develop web pages, with good chance of success.

We’re not going for originality. We are looking to use standard resources, standard tools, and standard best practices. Let us study the core elements of a web design: Styles and Trends, UI Elements, Interactions, Fonts, Icons, Colors, and Layouts.

Styles and Trends

This is my soon-to-be-outdated advice: follow Google style. Use affordance where you need it instead of full flat design. Don’t go for skeuomorphic designs, as they are difficult to design and maintain.

Skeuomorphism: Imitating the real world. That is, when you find a notebook app imitating a physical notebook, that is skeuomorphic design.

There are positive points for this design, of course. The user instantly recognizes it by identifying with the real-world counterpart. For novice users, it is a worthwhile association. For example, if a novice user sees a notebook with familiar ruled yellow paper, they know what it is for.

But, the flip side is that it is difficult to design. And, it gets annoying quickly. For an experienced user, it is a hurdle in frequent usage.

Follow the link to see the set of examples to understand how it can easily go overboard.

Even Apple realized the excesses of skeuomorphic design and began to prefer flat design. Flat design reduces clutter and simplifies the appearance. However, flat design loses affordance. That is, if something looks like a button, you press it; if it is just a patch of color, you don’t think of pressing it. Still, it is the recent trend. You can find different themes for the standard layouts. I find it quite useful for presenting information.

For instance, the above picture is a good illustration of information presentation using flat design. You can see a lot of examples of flat widgets in the Flat-UI project.

How about bringing some affordances to flat design? See http://sachagreif.com/flat-pixels/ for ideas on how to do it. Simply put, you can do flat design, but add shadows, gradients, and even buttons with relief to retain the usability of the design.

UI Elements

A page is composed of several UI elements. Some are basic, a standard part of HTML, such as tables, buttons, forms etc. Some are built with these basic elements: menus, breadcrumbs etc. Here are some simple rules in creating a standard set of UI elements:

Do not design any UI elements. Borrow from a consistent set. Please see the layouts section for further details on the standard set of UI elements that come with toolkits like Bootstrap.

For higher-order UI elements, again, do not design your own ones. If the toolkit provides them, use them. If you must design, create a catalog of semantic elements you need and use that to guide a standard set of elements.

http://www.cssbake.com/ – more focus on the basic elements – these can be used to spruce up the ones that come with the layout toolkit.

Interactions

These days, web page design is not static. Depending on the user interactions, we need to change the page. For instance, when the user selects a particular part of the page to interact with, perhaps it makes sense to remove the unneeded parts of the page. Lots of these interactions are accomplished with jQuery and its plugins. Some of the standard interactions are table design and the infinite scrolling that you see on Facebook.

To use the fonts well, you need to understand how to use sizing of fonts to your advantage. If you use the UI patterns appropriately, you will know how to use the right font size to do a call out, or promote. You also will understand how to use colors to indicate the information classification to the users.

Icons

Icons make information easily identifiable and usable. Even the simplest icons provide information quickly. For example, take Otl Aicher’s stick figures: he designed the icons for the Munich Olympics and changed the way public communication occurs through pictograms.

On the web, icons play an even bigger role. There are two ways to use icons:

Using icons as images: For instance, you can find many sets of icons that are free to use in your website. All you need to do to incorporate these icons is to download them and use their jpg/gif/svg files in your website.

Using icons as font: The problem with using icon images is that you cannot manipulate them. For instance, you cannot resize them (with svg you can, but other formats lose fidelity). You cannot change the color (you need multiple sets). You cannot transform them. Visit https://css-tricks.com/examples/IconFont/ to understand how icon fonts can be colored, resized, slanted, shadowed etc.
If you are using icon fonts, you can start with http://fortawesome.github.io/Font-Awesome/ which goes well with Bootstrap. Or, you could look at a comprehensive library like http://weloveiconfonts.com/

Still, if you need multi-colored icons, you need to use images.

Colors

Many engineers feel challenged when asked to choose colors. They resort to bland colors that do not work well with each other. If you are choosing a theme designed by an in-house designer, or a theme provider, the choices would be made for you. If you need to customize a standard theme, you can consider the following:

https://kuler.adobe.com/create/color-wheel/ – the color wheel is a standard way of choosing a set of colors that go well together. There are different variations – monochromatic to triad, or complementary to analogous. Choose one color and play with the combinations that work well with that color.

http://colorco.de is also a nice interface for using the color wheel. Feel free to select the type of color scheme you want and move the cursor over the colors to vary the combinations.
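If you want to see what these tools are doing, the schemes amount to rotating the hue of your base color by fixed angles. Here is a minimal sketch using Python's standard `colorsys` module. It assumes the plain HSL hue wheel and commonly used angles (180° for complementary, 120° steps for a triad, ±30° for analogous); the linked tools may use a different wheel or different offsets, so expect their results to differ somewhat.

```python
import colorsys


def rotate_hue(hex_color, degrees):
    """Return hex_color (like '#ff0000') with its hue rotated by the given angle."""
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (1, 3, 5))
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    h = (h + degrees / 360) % 1.0
    r, g, b = colorsys.hls_to_rgb(h, l, s)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


# Assumed hue offsets, in degrees, for each kind of scheme.
OFFSETS = {
    "complementary": [0, 180],
    "triad": [0, 120, 240],
    "analogous": [-30, 0, 30],
}


def scheme(hex_color, kind):
    """Return the list of colors for one scheme, starting from a base color."""
    return [rotate_hue(hex_color, d) for d in OFFSETS[kind]]


print(scheme("#ff0000", "triad"))  # ['#ff0000', '#00ff00', '#0000ff']
```

Lightness and saturation stay fixed here, which is why the results for a pure red look garish; in practice you would start from a softer base color.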

Layouts

The layout of the elements of the page is a crucial step in web design. The earliest designs did not understand the web as a different medium from desktop applications and gave us designs laden with the same kind of menus with dropdown choices. The web is a dynamic medium that can adjust based on the context, the user, and the situation.

There are two parts to the layout: what should be in the layout, and how they should be laid out.

Conceptual elements in a layout

What should be in each page or layout is a much bigger topic than this simple post. I will describe it in a separate post. Meanwhile, here are the fundamental rules to remember:

Make the pages task-oriented for the most part. If we need exploratory elements, use them as right recommendations in the page.

Do not clutter the page with all possible choices. Give the choices that make sense only in that context.

Give the user ability to escape the current thread of flow.

Feel free to hide and show the elements on demand. That is, we do not need to get a new page for every change in the page.

Respect the URL – The user should be able to bookmark and land on a page and carry on transaction from there; or, the user can do back and forth among the URLs.

Set a few standard page elements that reduce the need to learn for the users.

Soon, in other posts, I will describe the process of designing the elements of a layout.

Physical layout

The physical layout part has become simpler. These days, if somebody is developing the front end, the layout should satisfy the following:

It should work on any (modern) browser: Browser wars are oh so ancient. The web page should be viewable on any browser. However, since most modern web technologies require modern browsers, we can assume a modern browser (thanks to the mobile revolution), that is, one beyond IE7. Companies like Google already are pushing the market beyond even IE9. Browsers like Chrome and Firefox keep themselves updated to support most modern features.

It should work on any form factor: The layout should support any size of the browser. Some devices can only support smaller size browser; some support different orientations; some support different resolutions. Our layout should work on all these varying sizes, orientations, and resolutions.

We can check for the browser size (and the device, while we are at it), and generate the appropriate HTML. This approach doesn’t work well with proliferation of variations in sizes. Besides, what if we resize (or reorient) the browser after we got the page? Reloading the page is so retrograde, and breaks user experience (think re-posting a purchase – not what user expects).

We can indicate in CSS how to lay out the page: that is, no absolute sizes, only relative metrics. Using the right kind of weights and CSS positioning, we may be able to achieve the design we want.

We can add JS that can redraw the page based on the size: By adding JS that can hide or show elements, we can enhance the CSS to support devices even better. For instance, why show a full side bar with a menu, when we are seeing it on a mobile device, where there is barely enough space to display the main content?

While those are typical choices, in practice, you will use one of the following frameworks. These frameworks incorporate CSS and JS to deliver responsive design:

Bootstrap: The most popular choice for responsive design. You can customize what you need and get only the bare minimum needed. As a bonus, you will get a fully integrated set of icons, widgets, jQuery and plugins, and the ability to customize the look and feel of the site.

Zurb Foundation: Very similar to Bootstrap. The approach is more of a toolkit, in that it lets you design what you want. It has a limited set of UI elements, and is not as opinionated as Bootstrap is.

Pure CSS: If you cannot use JS (because of organizational policies against using JavaScript), you can always use Pure, which is a pure CSS-based responsive layout.

There are several other layout frameworks like Skeleton with minor variations on these categories. The popular ones like Bootstrap come with standard themes. These themes add standard layouts, colors, fonts, images, icons, and even templates. For example:

Every once in a while, I get the urge to work with computers. I want to get my hands dirty, figuratively, and dig into the details of installation, configuration, and execution. This experimentation comes in handy when we discuss the trends in the enterprise. Typically, we neglect processes when we do small scale experiments, but that is a matter for another time. Besides, it is really fun to play with new technologies and understand the direction in which these technologies are heading.

My personal machine

I wanted to run virtual machines on my system, instead of messing with my own machine. Because I multiplex a lot, I want to have a large enough server. That way, I can keep all the VMs open instead of waiting for them to come up when I need them.

Since my basic requirement is to have a large amount of memory, I settled on the Sabertooth X79 motherboard. It can support 64GB, which is good enough to run at least 8 VMs simultaneously. Someday, I can convert it to my private cloud instance, but till then, I can use it as my desktop machine with a lot of personal VMs running.

I have two 27” monitors ordered off eBay, directly from Korea. Each monitor, costing $320, offers 2560×1440 resolution, with a stunning IPS display; it is the same as in Samsung Galaxy, but with a large 27” diagonal size. These days, you can get them even from Newegg.

To support these monitors, you need dual-link DVI, and two such ports. The monitors do not support HDMI, and VGA would negate all the benefits of such high resolution. A reasonable consumer-grade choice is a card built on the GeForce GT 640, of which there are several.

Finally, I used pcpartpicker site (http://pcpartpicker.com/p/23MXv ) to put together all my parts and it showed if my build is compatible internally or not. Also, it helped me pick the stores where I can buy them from. I ended up ordering from newegg and Amazon, for most needs. I also had all other needed peripherals like Logitech mouse, webcam, and MS Keyboard etc. from before, which I used for my new computer.

Software

For software, I opted to use Windows 8.1, as I use Office apps most of the time. I use ninite.com to install all my apps; it can install all the needed free apps. Here are some of the apps I installed using that site: Chrome, Firefox, VLC, Java, Windirstat, Glary, Classic Start, Python, Filezilla, Putty, Eclipse, Dropbox, Google Drive.

Since I needed to run VMs on this machine, I had a choice of VMware Player or VirtualBox. I opted for VMware Player.

My cloud machine

While the personal machine is interesting, that was only to free up my existing 32GB machine. The cost of such a machine, with the right components, is less than $1000. As for software, I had the choice of using ESXi 5.5, XenServer 6.2, or Microsoft Hyper-V Server 2012 R2. All of them are free, which meets my budget.

I tried ESXi (VMware vSphere Hypervisor), which did not recognize the NIC on my motherboard. I tried injecting the driver from a previous release into the ISO, but even after it recognized the Realtek 8111 NIC, it still did not work. XenServer, on the other hand, worked perfectly on the first try. Since yesterday, I have been playing with Hadoop-based Linux versions in this setup.

If you want to try

It is fairly cheap to have your own private setup to experiment. Here is what you can do:

Get yourself a decent quad-core machine with 32 GB. You do not need a DVD drive etc. Add a couple of 3TB disks (the best is Seagate Barracuda, for the right price). If you can, get a separate NIC (Intel Pro 1000 is preferred, as it is best supported).

Install XenServer on the machine. It is nothing but a custom version of Linux, with Xen virtualization. You can log in like on any Linux machine as well. The basic interface, though, is a curses-based interface to manage the network and other resources. [Image courtesy: http://www.vmguru.nl/; mine was the 6.2 version and looks the same. From version 6.2, it is fully open source.]

On your laptop, install XenCenter, which is the client for it. XenCenter is a full-fledged client, with a lot of whizbangs. It has support for getting to the console of a machine, and other monitoring help. We can use XenCenter to install machines (from a local ISO repo), convert from VMDK to OVF format for importing etc.

It is best to create machines for it, as conversion is a little error prone. I created a custom CentOS 6.4, 64-bit machine. I used it as my minimal install.

When I installed it, the installation did not allow me to choose a full install. That is, it ended up installing only basic packages. I did the following to get a full install:

The console doesn’t seem to offer the right support for X. So, I wanted to have VNCserver with client running on my Windows box.

I installed all the needed RPM directly from the CD’s, using the following commands:

I added the CDROM as a device for the YUM repo. All I needed were a few edits in the yum.repos.d folder.

I mounted the CDROM on Linux (“mount /dev/xvdd /media/cdrom” : notice that the cdrom device is available as /dev/xvdd).

I installed VNC server and customized it to open at my display size of 2260×1440.

In the end, I removed the peripherals, and made the server headless, and stuck it in the closet. With wake-on-lan configured, I never need to visit the server physically.

At the end, you will have a standard machine to play with, and a set of minimal installs to experiment with in your XenCenter.

What you can do with it

Now, you do not have a full private data center. For instance, you don’t have other machines to migrate the VMs to, you cannot set up complex networking among machines, and you cannot connect storage to compute servers. Even with this setup, you can do the following activities:

Set up a sample Hadoop cluster to experiment: It is easy enough to start with the Apache Hadoop distribution itself so that you can understand the nitty-gritty details. There are simple tasks to test out the Hadoop cluster.

Set up a performance test center for different NoSQL databases, and run the performance tests. Of course, performance measurements under VMs cannot be trusted as valid, but at least you will gain expertise in the area.
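For the Hadoop item above, word count is the usual first task to try on a new cluster. Here is a minimal sketch of the map and reduce steps in plain Python, so you can see the logic before running it on real nodes. It does not use Hadoop itself, and the function names (map_words, reduce_counts, word_count) are made up for this illustration.

```python
from collections import Counter
from itertools import chain


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each distinct word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines):
    """Run the map step over every line, then reduce all emitted pairs."""
    return reduce_counts(chain.from_iterable(map_words(line) for line in lines))
```

On a cluster, the map step would run on many machines at once and the framework would group the pairs by word before the reduce step; here everything runs in one process.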

There is a lot of interest in moving applications to the cloud. Considering that there is no unanimous definition of cloud, most people do not understand the right approach to migrate to the cloud. In addition, the concept of migration itself is complex; what constitutes an application is also not easy to define.

There are different ways to interpret cloud. You could have private or public cloud. You could have just data center for hire or a full-fledged, highly stylized platform. You could have managed servers or instead measure in terms of computing units, without seeing any servers.

As we move applications to any of these different kinds of clouds, we will see different choices in the way we move them.

Moving a simple application

Let us consider a simple application.

The application is straightforward. Two or three machines run different pieces of software and produce a web-based experience to the customers. Now, how does this application translate to the cloud?

As-is to as-is move

Technically, we can move the machines as-is to a new data center, which is what most people do with the cloud. The notable points are:

To move to the “cloud” (in this case, just another data center), we may have to virtualize the individual servers. Yes, we can potentially run whatever OS on whatever hardware, but most cloud companies do not offer that. So, you are stuck with x64 and, possibly, Linux, Windows, and a few other x64 OSes (FreeBSD, illumos, SmartOS, and variants of Linux).

To move to the cloud, we may need to set up the network appropriately. Only the web server needs to be exposed, unlike the other two servers. Additionally, all three machines should be on the same LAN for high-bandwidth communication.

While all the machines may have to be virtualized, the database machine is special. Several databases, Oracle included, do not support virtualization. Sure, they will run fine in VMs, but the performance may suffer a bit.

In addition, databases have built-in virtualization. They support multiple users, multiple databases, with (limited) guarantees of individual performances. A cloud provider may offer “database as a service” which we are not using now.

In summary, we can move applications as-is to as-is, but we still may have to move to an x64 platform. Other than that, there are no major risks associated with this move. The big question is, “what are the benefits of such a move?” The answer is not always clear. It could be a strategic move; it could be justified by the collective move of several other apps. Or, it could be the right time, before committing a further investment to the data center.

Unfortunately, moving applications is not always that easy. Consider a slightly more complex version of the same application:

Let us say that we are only moving the systems within the dotted lines. How do we do it? We will discuss those complexities later, once we understand how to enhance the move so that it treats the cloud like a true cloud.

Migration to use the cloud services

Most cloud providers offer many services beyond infrastructure. Many of these services can be used without regard to the application itself. Incorporating them into existing processes, and adding new processes to support the cloud, can improve the business case for moving to the cloud. For instance, these services include:

Changes to these processes and tooling are not specific to one application. However, without changing these processes and ways of working, the cloud will remain yet another data center for IT.

Migration to support auto scaling, monitoring

If we go one step further and adjust the non-functional aspects of the applications, we can get more out of the cloud. The advantage of the cloud is the ability to handle the elasticity of demand. In addition, paying only for what we need is very attractive for most businesses. It is a welcome relief for architects who are asked to do capacity planning based on dubious business plans. It is an even bigger relief for infrastructure planners who chafe at the vague capacity requirements from architects. It is a much bigger relief for the finance people who need to shell out for the fudge factor built into capacity by the infrastructure architects.

But, all of that can work well only if we make some adjustments in the application architecture, specifically the deployment architecture.
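The elasticity described above comes down to a rule that turns current load into a machine count. Here is a minimal sketch of such a rule, assuming every instance is identical and handles a fixed number of requests per second. The function name, the parameters, and the default limits are all hypothetical; real cloud providers have their own policies and settings.

```python
import math


def desired_instances(requests_per_sec, capacity_per_instance,
                      min_instances=1, max_instances=20):
    """Return how many identical instances are needed for the current load.

    All parameters are illustrative assumptions, not values from any provider.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    # Clamp so we never drop below a safe floor or exceed the budget ceiling.
    return max(min_instances, min(max_instances, needed))
```

The floor keeps the application reachable when traffic is zero, and the ceiling is what protects the finance people from a runaway bill.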

How does scaling happen? In vertical scaling, we just move to a bigger machine. The problem with this approach is that the cost of the machine goes up dramatically as we scale up. Moreover, there is a natural limit to the size of the machine. If you want disaster recovery, you need to add one more machine of the same size. And, with upgrades, failures, and other kinds of events, large machines do not work out economically.

Historically, that was not the case. Architects preferred scaling up, as it was the easiest option, and investments in hardware went towards scaling up the machines. The new internet companies, however, could not scale vertically; the machines weren’t big enough. Once they figured out how to scale horizontally, why not use the most cost-effective machines? Besides, a system might require the right combination of storage, memory, and compute capacity. With big machines, it wasn’t possible to tailor to the exact specs.

Thanks to VMs, we could tailor the machine capacity to the exact specs. And with cheaper machines, we could create the right kind of horizontal scaling.

However, horizontal scaling is not so easy to achieve. Suppose you are doing a large computation, say, factorization of a large number. How do you do it on multiple machines? Or, what if you are searching for an optimal path through all fifty state capitals? Not so easy.

Still, several problems are easy to scale horizontally. For instance, if you are searching for records through a large set of files, you could do the searching on multiple machines. Or, if you are serving web pages, different users can be served from different machines.
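The file search case is easy to sketch. In the example below, worker threads on one machine stand in for separate machines; that substitution is my assumption to keep the example self-contained. The function names are made up for the illustration. The point is that each file is searched independently, and the partial results are merged only at the end.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def search_file(path, needle):
    """Return (file name, line number, line) for each line containing needle."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if needle in line:
                hits.append((Path(path).name, number, line.rstrip("\n")))
    return hits


def parallel_search(paths, needle, workers=4):
    """Search every file independently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = pool.map(lambda p: search_file(p, needle), paths)
        return sorted(hit for hits in partial_results for hit in hits)
```

Because no worker needs anything from any other worker, adding workers (or machines) does not require changing the search logic.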

Considering that most applications are web-based apps, they should be easy to scale. In the beginning, scaling was easy. None of the machines shared any state; that is, no communication among the machines was required. However, once the J2EE marketing machine moved in, these application servers ended up sharing state. There are benefits to that, of course. For instance, if a machine goes down, the user can be seamlessly served from another machine.

Suppose you introduce a machine or take out a machine. The system should be adjusted so that session replication can continue to happen. What if we run one thousand machines? Would the communication work well enough? In theory it all works, but in practice it is not worth the trouble.

Scaling to a large number of regular machines works well with stateless protocols, which are quite popular in the web world. If an existing system does not support this kind of architecture, it is usually not difficult to adjust it without wholesale surgery on the application.
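One common way to make web servers stateless is to let the client carry the session in a signed token, so that any machine can serve any request without session replication. The sketch below is one possible approach, not the only one. It assumes every server shares the same secret key, and the names (SECRET_KEY, issue_token, read_token) are hypothetical. The token is signed but not encrypted, so it should not carry anything confidential.

```python
import base64
import hashlib
import hmac
import json

# Assumption: every web server is configured with the same secret key.
SECRET_KEY = b"replace-with-a-real-secret"


def issue_token(session):
    """Serialize the session and sign it so the client can carry it."""
    payload = base64.urlsafe_b64encode(json.dumps(session, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def read_token(token):
    """Verify the signature and return the session, or None if tampered."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload.encode()))
```

With this arrangement, introducing a machine or taking one out requires no coordination among the servers; the new machine only needs the key.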

Most data centers do monitoring well enough. However, in the cloud, monitoring is geared towards maintaining a large number of servers; there is more automation built in, and a lot more of it is driven by log files. Most cloud operators provide their own monitoring tools instead of implementing the customer’s choice of monitoring tools. In most cases, by integrating with their tools (for instance, log file integration, events integration), we can reduce the operational costs of the cloud.

Migration to support cloud services

If you have done all that I told you to – virtualize, move to cloud, use auto-scaling, use the monitoring, what is left to implement? Plenty, as it turns out.

Most cloud providers provide a lot of common services. Typically, these services operate better at scale, and they implement well-defined protocols or needs. For instance, AWS (Amazon Web Services) offers the following:

Given this many services, if we just go from machines to machines, we might use only EC2 and EBS. Using the other services not only saves money and time but also, eventually, gives us access to trained engineers and third-party tools.

Re-architecting a system using these services is a tough task. In my experience, the following order provides the best bang for the buck.

The actual process of taking an existing application and moving it to this kind of infrastructure is something that we will address in another article.

Re-architecting for the cloud

While there may not be enough justification for re-architecting the applications entirely, for some kinds of applications it makes sense to use the platform that the cloud providers offer. For instance, Google compute offers a seductive platform for the right kind of application development. Take the case of providing an API for product information that your partners embed on their sites. Since you do not know what kind of promotions your partners are running, you have no way of even guessing how much traffic there is going to be. In fact, you may need to scale really quickly.

If you are using, say, Google App Engine, you won’t even be aware of the machines or databases. You would use App Engine and the APIs for Bigtable. Or, if you are using any platforms provided by the vendors (Facebook, SFDC, etc.), you will not think of machines. Your costs will truly scale up or down without your actively planning for it.

However, these platforms are suitable for only certain kinds of application patterns. For instance, if you are developing a heavy-duty data transformation, standard App Engine is not appropriate.

Creating an application for a specific cloud or platform requires designing the application to make use of the platform from the cloud. By also providing a standard language runtime, libraries, and services, the platform can lower the cost of development. I will describe the standard cloud-based architectures and application patterns some other day.

Complexities in moving

Most of the complexities come from the boundaries of applications. You saw how many different ways an application can be migrated if it is self-contained. Now, what if there are a lot of dependencies? Or communication between applications?

Moving in groups

All things being equal, it is best to move all the applications at once. Yet, for various reasons, we move only a few apps at a time.

If we are migrating applications in groups, we have to worry about the density of dependencies and the communication among the applications. Broadly speaking, communication between apps can happen in the following ways.

Exchange of data via files

Many applications operate on import and export (and transformation jobs in between). Even when we move only a subset of the applications, it is easy enough to keep this file-based communication going. Since file-based communication is typically asynchronous, it is easy to set up for the cloud.
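A file-based exchange of this kind can be sketched in a few lines. The example below assumes the exporting and importing applications share a directory (or one that is synced between the data center and the cloud); the directory layout, file naming, and function names are hypothetical. The one detail worth getting right is writing to a temporary name and renaming afterwards, so that the importer never picks up a half-written file.

```python
import json
import os
from pathlib import Path


def export_batch(records, outbox_dir, name):
    """Write records to a temp file, then rename so readers never see a partial file."""
    outbox = Path(outbox_dir)
    outbox.mkdir(parents=True, exist_ok=True)
    temp_path = outbox / (name + ".tmp")
    final_path = outbox / (name + ".json")
    temp_path.write_text(json.dumps(records), encoding="utf-8")
    os.replace(temp_path, final_path)  # atomic on the same filesystem
    return final_path


def import_batches(outbox_dir, done_dir):
    """Load every finished export, then move it aside so it is not read twice."""
    done = Path(done_dir)
    done.mkdir(parents=True, exist_ok=True)
    imported = []
    for path in sorted(Path(outbox_dir).glob("*.json")):
        imported.extend(json.loads(path.read_text(encoding="utf-8")))
        os.replace(path, done / path.name)
    return imported
```

The importer can run on its own schedule, which is exactly why this style of integration survives a move to the cloud so well.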

Exchange of data via TCP/IP based protocols

In some cases, applications may be communicating via standard network protocols. Two applications may be communicating via XML over HTTP. Or, they could be communicating over standard TCP/IP with other kinds of protocols. X Window applications communicate over TCP/IP with the X server. Applications can use old RPC protocols. While these protocols are not common anymore, we might still encounter this kind of communication among applications.

To allow the communication to continue, we need to set up the firewall to allow it. Since we know the IP addresses of the end points, the specific ports, and the specific protocols, we may be able to set up effective firewall rules to allow such communication. Or we can set up a VPN between the two locations.
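To make the idea of such rules concrete, here is a small sketch of an allowlist check: traffic is allowed only when the source address, port, and protocol all match an explicit rule. This is an illustration of the logic, not the configuration syntax of any real firewall; the Rule class, the is_allowed function, and the addresses in the usage note are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str      # CIDR block of the allowed caller, e.g. "10.0.1.0/24"
    port: int        # destination port
    protocol: str    # "tcp" or "udp"


def is_allowed(rules, source_ip, port, protocol):
    """Default deny: allow only traffic that matches an explicit rule."""
    address = ipaddress.ip_address(source_ip)
    return any(
        address in ipaddress.ip_network(rule.source)
        and port == rule.port
        and protocol.lower() == rule.protocol
        for rule in rules
    )
```

For example, with a single rule Rule("10.0.1.0/24", 5432, "tcp"), a call from 10.0.1.17 to port 5432 over TCP is allowed, while the same call from 10.0.2.17, or to port 22, is not.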

It is easy to handle the network throughput; in most applications, the throughput requirements are not very high. However, it is very common to have a low-latency requirement between applications. In such cases, we can consider a dedicated network connection between the on-premise data center and the cloud data center. In several ways, it is similar to handling multi-location data centers.

Even with a dedicated line set up, we may not be fully out of the woods yet. We may still need to reduce the latency further. In some cases, we can deal with it by caching and other similar techniques. Or, better yet, we can migrate to modern integration patterns such as SOA or a message bus using middleware.
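As a sketch of the caching option, the class below keeps the result of a slow remote lookup for a short time, so that repeated requests for the same key do not cross the network again. It is a minimal illustration under my own assumptions: the class name, the 30-second default, and the fetch callback are hypothetical, and a real system would also need to bound the cache size and decide how stale an answer it can tolerate.

```python
import time


class TTLCache:
    """Cache remote lookups for a short time to avoid repeated round trips."""

    def __init__(self, fetch, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch = fetch          # function that performs the slow remote call
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._entries = {}           # key -> (expiry time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh hit: no network call
        value = self._fetch(key)     # miss or stale: go to the remote system
        self._entries[key] = (now + self._ttl, value)
        return value
```

The trade-off is the usual one: the longer the time-to-live, the fewer round trips, and the older the data the application may see.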

Exchange of data via messages or middleware

If we are using middleware to communicate, the problem becomes simpler. Sure, we still need to communicate between the apps, but all the communications go through the middleware. Moreover, middleware vendors are dealing with integrating applications across continents, across different data centers, and across companies.

ESB or any other variants of middleware can handle a lot of integration-related complexities. They can do transformation, caching, store and forward, and security. In fact, some of the modern integration systems are specifically targeted towards integrating with the cloud, or running integration systems in the cloud. Most cloud providers offer their own messaging systems that work not only within their clouds, but also across the Internet.

Database based communication

Now, what if applications communicate via a database? For instance, an order processing system and an e-commerce system may communicate via a shared database. If the e-commerce system is in the cloud, how does it communicate with the on-premise system?

Since this is a common problem, there are several special tools for DB-to-DB sync. If the application doesn’t require real-time integration, it is easy to sync the databases. Real-time or near-real-time integration between databases requires special (and often expensive) tools. A better way is to handle the issue at the application level. That means we should plan for asynchronous integration.
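One way to do asynchronous integration at the application level is to have the application record each outgoing message in its own database, in the same transaction as the business change, and deliver the messages later. This is often called an outbox. The sketch below uses SQLite from the Python standard library as a stand-in for the real database; the table names, function names, and the send callback are all hypothetical.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Create the orders table plus an outbox table for pending messages."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, item TEXT)")
    db.execute(
        "CREATE TABLE IF NOT EXISTS outbox "
        "(id INTEGER PRIMARY KEY, payload TEXT, sent INTEGER DEFAULT 0)"
    )
    return db


def place_order(db, item):
    """Record the order and its outgoing message in one local transaction."""
    with db:  # commits both inserts together, or rolls both back
        order_id = db.execute("INSERT INTO orders (item) VALUES (?)", (item,)).lastrowid
        payload = json.dumps({"order_id": order_id, "item": item})
        db.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
    return order_id


def drain_outbox(db, send):
    """Deliver pending messages later; mark each as sent only after send() returns."""
    rows = db.execute("SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id").fetchall()
    for row_id, payload in rows:
        send(json.loads(payload))
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

If the link to the other system is down, orders still get placed and the messages simply wait. Note that a message can be delivered twice if the process fails between send() and the update, so the receiving side should tolerate duplicates.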

Conclusion

Moving applications to the cloud opens up many choices, each with its own costs and benefits. If we do not understand the choices and treat every kind of move as equal, we risk not getting the right kind of ROI from moving to the cloud. In another post, we will discuss the cloud migration framework, how to create the business case, and how to decide which application should migrate to which cloud and to what target state.

I attended Strata, a big data conference, last week (Feb 11-13) in Santa Clara, CA. Over the years, it has become big. This year, it can be said to have become mainstream; there were a lot of novices around. I wanted to note my impressions for those who would have liked to attend the conference.

Exhibitors details

Big picture view

Most of the companies, alas, are not used to the enterprise world. They are from the valley, not from the plains where much of this technology can be used profitably. Even in innovation, there are only a few participants. Most of the energy is going into minute increments in the usability of the technology. Only a few companies are addressing the challenge of bringing big data to mainstream companies that have already invested in a plethora of data technologies.

The established players like Teradata, Greenplum would like you to see big data as a standard way of operating along with their technologies. They position big data as relevant in places, and they provide mechanisms to use big data in conjunction with their technologies. They build connectors; they provide seamless access to big data from their own ecosystem.

[From Teradata website.]

As you can see, the center of Teradata’s world is solidly its existing database product(s).

The newcomers like Cloudera would like to upend the equation. They compare the data warehouse to a big DSLR camera and big data to a smartphone. Which gets used more? While the data warehouse is perfect for some uses, it is costly and cumbersome, and does not get used in most places. Big data, in contrast, is easy, with a lot of advances in the pipeline to make it easier still. Their view is this:

[From Cloudera presentation at Strata 2014].

Historically, in place of the EDH (enterprise data hub), all you had was some sort of staging area for ETL or ELT kinds of work. Now, they want to enhance it to include a lot more “modern” analytics, exploratory analytics, and learning systems.

These are fundamentally different views. While both see big data systems co-existing with the data warehouse, the new companies see them taking on an increasing role in providing ETL, analytics, and other services. The old players see them as an augmentation to the warehouse when unstructured data or large data volumes are present.

As an aside, at least Cloudera presented their vision clearly. Teradata, on the other hand, came in with marketese that offers no information on their perspective. I had to wade through several pages to understand their positioning.

A big disappointment is Pivotal. They ceded the leadership in these matters to other companies. Considering their leadership in Java, I expected them to extend MapReduce to multiple places. That job has been taken up by the Berkeley folks with Spark and other tools. With their lead in Greenplum HD, I thought they would define the next-generation data warehouse. They have a concept called the data lake, which is merely a concept. None of the people in the booth could articulate what it is, how it can be constructed, in what way it is different, or why it is interesting.

Big data analytics and learning systems

Historically, the analytics field has been dominated by descriptive analytics. The initial phase of predictive analytics focused on getting the right kind of data (for instance, TIBCO was harping on real-time information to predict events quickly). Now that we have big data, it is not so much about getting the right data as about computing on it fast. And not just computing fast, but having the right statistical models to evaluate correlations, causations, and other statistical measures.

[From Wikipedia on Bigdata]

These topics are very difficult for most computer programmers to grasp. Just as we needed an understanding of algorithms to program in the beginning, we need knowledge of these techniques to analyze big data these days. Just as the libraries that codified the algorithms made them accessible to any programmer (think of when you had to program the data structure for an associative array yourself), a new crop of companies is creating systems to make analytics accessible to programmers.

SQL in many bottles

A big problem with most big data systems is the lack of relational structure. Big data proponents may rail against the confines of relational structures, but they are not going to fight against SQL systems. A lot of third-party systems assume SQL-like capabilities from the backend systems. And a lot of people are familiar with SQL systems. SQL is remarkably succinct and expressive for several natural activities on data.

A distinct trend is to slap a SQL interface onto non-SQL data. For example, Presto does SQL on big data. Impala does SQL on Hadoop. Pivotal does HAWQ. Hortonworks does Stinger. Several of them modify SQL slightly to make it work with reasonable semantics.
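To show what putting a SQL interface on non-SQL data means at toy scale, here is a sketch that loads schemaless JSON records into an in-memory SQLite table and then queries them with ordinary SQL. It says nothing about how Presto, Impala, or the other engines work internally; they run distributed queries over data where it lives. The function name and the table name events are made up for the illustration, and the sketch assumes at least one record and only scalar field values.

```python
import json
import sqlite3


def query_json_lines(lines, sql):
    """Load JSON records into an in-memory table named events, then run sql."""
    records = [json.loads(line) for line in lines if line.strip()]
    # Use the union of all keys as columns, since the records carry no schema.
    columns = sorted({key for record in records for key in record})
    db = sqlite3.connect(":memory:")
    column_list = ", ".join(f'"{name}"' for name in columns)
    db.execute(f"CREATE TABLE events ({column_list})")
    placeholders = ", ".join("?" for _ in columns)
    db.executemany(
        f"INSERT INTO events VALUES ({placeholders})",
        [tuple(record.get(name) for name in columns) for record in records],
    )
    return db.execute(sql).fetchall()
```

A record that lacks a field simply gets NULL in that column, which is one of the small semantic adjustments needed to make SQL fit data without a fixed schema.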

Visualization

The big data conference is big on visualization. The key insight is that visualization is not merely something that enhances analytics or insights. It is itself a facet of analytics; it is itself an insight. Proper visualization is the key to so many other initiatives:

Design time tools for various activities, including data transformation.

Monitoring tools on the web

Analytics visualization

Interactive and exploratory analytics

The big story is D3.js. How a purely technical library like D3.js has become the de facto visualization library is something that we will revisit some other day.

Summary

I am disappointed with the state of big data. A lot of companies are chasing the technology end of big data, with minute segmentation. The real challenge is adoption in the enterprises, where the endless details of big data and too many choices increase the complexity of solutions. These companies are not able to tell businesses why and how they should use big data. Instead, they collude with analysts, media, and a few well-publicized cases to drum up hype.

Still, big data is real. It will grow up. It will reduce the costs of data so dramatically that it will support new ways of doing old things. And, with the right confluence of statistics and machine learning, we will see the fruits of big data in every industry: doing new things in entirely new ways.