Ok, so we’re not really there yet, but it looks like many big players are aiming for this within the next few years. The list of software moved to the web can be made long. Although Google received a lot of press for Google Docs and Google Spreadsheets, there are at least a handful of other equally interesting products. A company named GravityZoo aims to bring the entire desktop online; among many other things, they’re working on porting OpenOffice to the web. What makes this more interesting than Google Docs is that instead of being a commercial product, they must release the source code of their product. This means that companies might be able to run the solution on their intranets, so sensitive information never needs to leave the company network. Many might argue that if one uses Google’s commercial service, the data is still safe, even if it’s online. However, since many larger corporations’ IT policies strictly state that internal information is not allowed to leave the local network, running a web-based OpenOffice on their intranet lets them get the benefits of the web app without sensitive information ever leaving the corporate network. Moreover, with a simple VPN solution, even road warriors will be able to take advantage of this setup.

Now, let’s look at the ups and downs of using web-apps instead of traditional software. When I think of web-apps, the first thing that comes to mind is the administrative aspect. One of the largest benefits of administering web-apps rather than traditional apps is that you don’t need to configure each and every one of your desktop machines with the particular software. Although you probably want to install a more secure browser than Internet Explorer if they’re running Windows, that’s really all you need to do on the clients. Another quite obvious benefit is platform independence. If your web-app is well written, it should work in any browser on any platform, which is a great thing, since you don’t have to spend money on porting your software to a variety of platforms. Moreover, if you have a variety of platforms, file sharing tends to be a hassle. If you’re running a web-port of OpenOffice with built-in file management, you don’t need to worry about this anymore.

So what’s the downside? I spent quite some time thinking of drawbacks of using web-apps, but could only really come up with one: they might be less responsive. If you’re on a slow connection, let’s say over the Internet, the delay can be very annoying. However, if you’re running the web-app on a local 100 Mbit network, the delay of a well-written AJAX web-app should be quite small. I think the largest obstacle to overcome is the mindset of the users.

Speaking of web-apps, we at WireLoad are planning to make a web-port of Firefox. We also talked about porting this blog to the web…

If you’re a regular reader of our blog, you may remember an article a while back about a piece of software called Cacti. It’s a nifty little web-based program that gathers information from a variety of hardware using SNMP. Cacti then presents the data in easily readable graphs.

At the time of the article, I installed Cacti for one of the organizations whose IT infrastructure I administer. Not only did I get a better idea of the utilization of bandwidth and hardware on the servers, but I could also see how much CPU the workstations were consuming. Although I knew that the CPU usage on the servers was quite low, I didn’t anticipate that the CPU usage on the workstations was quite as low as it was.

The organization is quite a typical office environment with 20-some workstations running mainly our own software plus web, e-mail, word-processing and spreadsheet applications. The hardware is quite modern, with Celeron CPUs ranging from 1.8 GHz to 2.7 GHz and RAM from 256 MB to 512 MB. All workstations run Windows 2000 Professional. Before I installed Cacti, I thought that CPU usage during the day would average maybe 30-40%, with some significant peaks pushing up the average. However, I was quite surprised to find out how wrong my estimate was. It turned out that the average CPU usage was less than 10% for all machines, and less than 5% for most of them, with only a few significant spikes. It’s true that Cacti only polls the workstations every five minutes, but it should still give a quite accurate picture over the past months I’ve been running it.
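To illustrate why even sparse five-minute polling gives a trustworthy average, here is a minimal sketch in Python. The sample values are made up for the illustration, not actual Cacti data, and the 50% spike threshold is my own assumption:

```python
# Hypothetical 5-minute CPU samples (percent) for one workstation,
# similar in spirit to what Cacti collects over SNMP.
samples = [3, 4, 2, 5, 60, 3, 4, 2, 3, 4, 55, 3]

average = sum(samples) / len(samples)
spikes = sum(1 for s in samples if s >= 50)  # 50% threshold is an assumption

print(f"average CPU: {average:.1f}% with {spikes} spikes")  # → average CPU: 12.3% with 2 spikes
```

Even with a couple of heavy spikes in the mix, the average stays low, which matches what the graphs showed.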

Sample of the monthly view in Cacti

With this data on hand it’s quite obvious that we’ve been over-investing in hardware for the workstations. Even though I would rather overestimate a little than underestimate, it seems my estimates were far too high. Even so, when purchasing new workstations, we’ve pretty much bought the cheapest Celeron available from the major PC vendors, so it might have been hard to adjust our purchases even with these figures on hand.

After some thinking I came up with three possible ways to deal with this problem:

1. Ignore it
I guess this is what most companies do. Maybe the feeling of being ‘future-proof’ is valued more than the fact that you have a lot of idle time. The benefit of doing this is that you have modern hardware, which is less likely to break than old hardware.

2. Buy used hardware
Most people in power would be scared by the mere thought of this. However, since there is no truly low-end new hardware available, this appears to be the only way. The first problem you will face is probably finding uniform hardware. As an administrator, you know how much easier it is to administer 20 identical workstations than 20 different ones, both in terms of drivers and hardware maintenance.

The second problem that came to mind is reliability. Obviously, hardware that is 5-10 years old is more likely to break than brand new hardware, and if it does, there is no warranty to cover it. However, if you buy used hardware, your budget will likely allow you to buy a couple of replacement computers.

There are also security implications of buying used computers. Every modern company with an intelligent IT staff is concerned about security, both in software and hardware. If you buy used hardware, there is a chance that it might be compromised (hardware sniffers, etc.). I guess the only way to deal with this problem is to physically inspect all the hardware you purchase.

If you or your company does choose to buy used hardware, there are plenty of sources. One of the more interesting sites I found was an Australian company called Green PC, which sells a variety of computers and peripherals at reasonable prices.

3. Donate idle-time (to internal or external use)
With the rise of clusters, distributed computing and virtualization, there are today plenty of ways to put idle time to good use. One of the more famous projects in this area is Folding@home, a project at Stanford University that uses participants’ idle CPU/GPU time for medical research. More recently, a project at Berkeley called BOINC created a program that lets the user choose between a variety of distributed computing projects within the same application. By participating in such a project, a company can create positive publicity (if the participation is significant).

If your company isn’t interested in donating idle time to charity or research, it might still be able to use that time itself. If all your workstations are connected with a high-speed connection (preferably gigabit), you might be able to use the computers in a virtualization environment. This is doable in theory, but I don’t know how well it would work in reality. Another alternative is to use the idle time for internal computations. If your company is in the software business, distributed compiling might be one way to use the CPUs more efficiently. Otherwise, there are plenty of distributed computing solutions that could be used on an intranet for calculations the company might otherwise pay a third party to perform.

Hopefully you now have a better idea of how you can use your idle time more efficiently, but you should still be careful. Some people argue that today’s computers are not built to run at 100% utilization 24/7. This is a valid point, since neither the components on the motherboard nor the fans are likely to withstand 100% utilization for very long without breaking. Therefore, try to find a distribution algorithm that spreads the computation over the nodes without pushing the individual workstations until they break. I have to admit that I don’t have any data on how the lifetime of the workstations is affected by running this type of software, but I would guess there is some impact.
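One simple way to avoid pushing a workstation to a constant 100% is to throttle the duty cycle in the worker itself. The following is just a sketch of the idea under assumed names and a made-up 50% target; it is not taken from any particular distributed computing framework:

```python
import time

def run_throttled(task, chunks, max_duty=0.5):
    """Run `task` over `chunks`, sleeping after each work slice so the
    busy/(busy+idle) ratio stays near `max_duty` instead of 100%."""
    results = []
    for chunk in chunks:
        start = time.monotonic()
        results.append(task(chunk))            # one slice of real work
        busy = time.monotonic() - start
        # Idle long enough that busy / (busy + idle) equals max_duty.
        time.sleep(busy * (1.0 - max_duty) / max_duty)
    return results

print(run_throttled(lambda n: n * n, [1, 2, 3]))  # → [1, 4, 9]
```

The same computation takes roughly twice as long at a 50% duty cycle, but the fans and voltage regulators get to breathe between slices.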

To round up this article, I would like to discuss one question that is highly relevant: “Why are there no low-end, cheap computers available?”

If you go to Dell’s or HP’s homepage and look for their cheapest ‘office’ hardware, it’s still far more than what is required for most office use. So why is it this way? As I see it, there are several reasons, involving both the software and hardware manufacturers in a mutual effort to stimulate sales. Obviously, the hardware manufacturers want us to replace our computers as often as possible, since this is how they make their profits. The software manufacturers, on the other hand, want to sell new versions of their software by implementing fancy new features that are unlikely to add to productivity, but require hardware upgrades to run properly.

Let’s say you’re on board with my ideas and decide to look for cheaper hardware, but still feel that used hardware is too risky. One possibility might be to go for some kind of ITX solution. These come with a less powerful CPU and often include everything you need for desktop usage, but cost less than regular computers. One benefit of ITX boxes is that they are very small and light, which makes them cheap to store and ship internally.

Until a couple of days ago, my bookshelf was filled with binders of old lecture notes from school. The truth is that I don’t think I ever opened one of those binders after finishing the final for the class. Yet I didn’t want to throw it all away, since it might come in handy some day when I want to refresh my memory.

On the other hand, these binders really bothered me. They took up space in the bookshelf that I could use for something more useful. So I thought, why don’t I digitize these papers?

1. Preparing your documents

Prepare the documents you want to scan. That means figuring out how you want to group your documents and removing the staples. Since I was scanning lecture notes, the grouping was quite simple. Removing the staples is a boring job, but it needs to be done.

In the process of scanning…

2. Scanning your documents

This is the time-consuming part. Depending on your hardware, the time the scanning takes varies a lot. With the scanner I was using (an HP Scanjet 5590), one sheet (front and back) took about 35 seconds at 150 DPI. If you have a scanner with an ADF, it doesn’t really matter much whether it takes 10 or 40 minutes to scan a pile of papers, since you can go do something else in the meantime.

Depending on the software you’re using, the file output might differ. In the software I was using, the naming scheme ‘bus-law_0_0.jpg’ turned out to work quite well. The first ‘0’ is the sequence number. If for some reason the scanning aborts, you can just continue with ‘bus-law_1_0.jpg’, and the files will still sort in order.
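My scanning software generated the names for me, but if yours does not, a tiny helper can produce them. Zero-padding the page number is an addition of mine, not part of the original scheme; it keeps alphabetical order equal to scan order even once you pass page 100:

```python
def scan_name(group, batch, page, width=4):
    """Build names like 'bus-law_0_0007.jpg'. Zero-padding the page
    number keeps lexicographic order equal to numeric order."""
    return f"{group}_{batch}_{page:0{width}d}.jpg"

print(scan_name("bus-law", 0, 7))    # → bus-law_0_0007.jpg
print(scan_name("bus-law", 1, 103))  # → bus-law_1_0103.jpg
```

With padded names, the sorting problem described in step 4 never comes up in the first place.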

3. Preview and delete blanks

When you’ve scanned an entire group of documents, select them all and drag them to Preview. Use the arrow keys to browse through all the documents to make sure they look good. You might want to rotate some documents or delete some blank pages. I found the shortcut ‘Apple + Delete’ very handy in Preview, since it lets me delete the file from within Preview without having to switch to the Finder.

4. Convert your documents to a PDF

Screenshot of PDFLab

Up to this point you just have a bunch of JPEG files in a folder somewhere. Since this is not very convenient when browsing notes, I wanted to convert every group into a single PDF file. While doing my research I found a very handy piece of software called PDFLab. It is freeware and works really well.

Download PDFLab and fire it up. Now go to the folder where you saved all those jpeg files. Select them all, and drag them to PDFLab. This might take a couple of minutes, depending on your hardware.

When the files are imported into PDFLab, sort them by name by clicking ‘Name’. Now look through the list. If you have file names that go above 100 (‘bus-law_0_0100.jpg’), the sorting might not come out right, since the file ‘bus-law_0_0103.jpg’ sorts before ‘bus-law_0_013.jpg’. If you run into this, you need to move the files around manually until they are in the proper sequence.
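If moving files around by hand gets tedious, you can sort the names ‘naturally’ yourself, comparing the number runs as numbers instead of text. A quick sketch in Python:

```python
import re

def natural_key(name):
    """Split a name into text and number runs, so 'bus-law_0_013.jpg'
    sorts before 'bus-law_0_0103.jpg' (13 < 103 as numbers)."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", name)]

files = ["bus-law_0_0103.jpg", "bus-law_0_013.jpg", "bus-law_0_2.jpg"]
print(sorted(files, key=natural_key))
# → ['bus-law_0_2.jpg', 'bus-law_0_013.jpg', 'bus-law_0_0103.jpg']
```

PDFLab can’t use this key directly, but you can apply the same idea to rename the files with zero-padded numbers, so that PDFLab’s own name sort comes out in the right order.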

When you’re happy with the sorting, hit ‘Create PDF,’ and enter an output file-name in the dialog which appears. If the PDF was generated without any errors, you’re all set.

If you get an error message when generating the PDF, just hit OK, and try to create it again. If this doesn’t work, try to restart the software.

5. Delete/Backup the image-files

When you’ve made sure that your PDF works fine, you can either delete your JPEG files or burn them to a CD just to be safe.

That’s it. You can now throw all those papers into the recycle bin. The best thing is that you’re never more than a couple of clicks away from your documents.

Empty binders

6. Drawbacks

This solution is not perfect, but it’s certainly better than having all those binders in the bookshelf or in a box somewhere. The main drawback is that the documents are not searchable. This could possibly be solved with OCR, but in my experience, OCR is still not powerful enough to recognize all handwriting. OCR also tends to mess up documents that mix text and images. However, if I were able to scan these documents into a PDF with OCR recognition, that would be the optimal solution, since the result would both be searchable and consume less space.

Spotplex is a new service that launched a few days ago. It is squarely positioned as a kind of digg competitor. The spin is that instead of relying on user voting, they let people vote with their feet. If a lot of people view a certain article in a day, that article is deemed ‘popular.’ Simply put, popular articles end up on their front page precisely because they’re popular.

While there isn’t much material on the site detailing their intentions, the idea is presumably that the visitor-count method will prevent rigging the game, as may happen with digg. (Kevin Rose and the digg team are fiercely fighting cheaters, though.) On digg, people supposedly sell their votes or work together in teams to promote certain articles to the front page. On Spotplex this is meant to be more difficult. On digg it may be sufficient to get 40 willing people with digg accounts to put a story on the front page, but to do the same thing on Spotplex, you would need thousands of people with unique IP addresses to surf to your page.

There will probably be some way to trick Spotplex too – someone with access to a lot of zombie computers could perhaps do it. But in general it should be harder. Spotplex has a lot more data to go on – they can check IP addresses, referrer addresses, browsers and so on, and look for patterns among the thousands of views an article needs to reach the front page. They should be able to prevent at least all basic forms of cheating with much less effort than digg.

Another distinguishing characteristic of Spotplex is that they use AJAX like there’s no tomorrow. The front page is nothing but a frame with little windows and the JavaScript necessary to fill those windows with dynamically sourced data. The first thing you see when you come to the front page is in fact nothing of interest at all. Instead there are three major panes, each with a subtle little ‘loading’ tag in the background.

Spotplex doing its thing: ‘Loading…’

Apparently this dynamic design is putting quite a strain on the servers Spotplex invested in initially. Although not a very scientific test, we’ve been checking in on the front page of Spotplex once in a while for the last couple of days and we have usually been greeted only with lots of spinning ‘please wait’ indicators. Most of the time these last for several long seconds and that’s after the actual page took a few seconds to load too. The site feels tired just loading the front page.

Unfortunately Spotplex seems to be having more trouble than that. When Playing With Wire was invited to join the first 1,000 blogs to be on Spotplex, we received an invitation code. That invitation code could be used on the Spotplex page to get a code number. Supposedly the same page was to provide HTML code meant to go on the actual blog pages, but there was some kind of issue and we didn’t get any. No matter, we contacted Spotplex support who gave us the code promptly. Next, we inserted the code on our pages – you might have noticed the Spotplex image in the side bar.

This seemed to work well initially, and the page for our code started registering both a little bit of our page views and what articles were currently popular. Alas, about two days into the test Spotplex ceased to count our views and our number of views for the last 24 hours steadily declined to 0 on the Spotplex site.

At the time of this writing the Spotplex front page is loading as slowly as ever, the Spotplex ‘get the code for your blog’ page is still not producing any actual code, and a search for “www.playingwithwire.com” on Spotplex returns no results. So apparently Spotplex is still struggling with the basics of their service. They will need to get on top of this quickly, because the greater problem demands attention: can they really prevent people from generating fake ‘views’ for their blogs? Before they can compete with digg at all, they will have to prove that spam won’t rule their front page.

Update 1: We contacted Spotplex and they let us know that they are working on a potential database problem affecting Playing With Wire.

The record and movie industry has expressed a lot of concern about copyright infringement lately. In the ideal world, these organizations would argue, anything ever made by any of their members would be forever copyrighted. Since they own this information, or ‘intellectual property’, nobody should be allowed to reproduce it without their explicit permission.

I have to say I agree absolutely. Without this form of protection, there would be no culture at all. Copyright is an essential part of society. What does distress me, though, is the laissez-faire attitude even these organizations have when it comes to enforcing this fundamental right of artists and creators in the world. While the RIAA and MPAA have a lot of opinions, they do not seem to walk the walk. In particular, there is a special copyright infringement technique through which perpetrators are virtually unhindered in reproducing materials without paying for the privilege. To any reasonable person it must be obvious that this is an untenable situation. If I write a book, a blog entry, create a piece of music, or design a game, I should be allowed to reap the benefits of the hard work I put into these pursuits. If anyone coming into contact with this information product were allowed to copy, retain and spread my product, they would be depriving me of a basic right to my own work. They would in fact be stealing my work.

In many cases these organizations do protect us artists. They will prosecute thieves of physical goods; they will sue criminals engaging in blatant copyright infringement online and elsewhere. But for some reason which I cannot fathom, they let one of the most commonly used techniques for copyright infringement today go unpunished.

What I am referring to is of course the theft of intellectual property by people with eyesight.

Without any enforcement whatsoever of applicable laws, these individuals are unhindered to make unlimited copies of any material they come across by using the technology of ‘bio copying’ – also known as ‘remembering things’ in layman terms. Even now as you read this, there are less scrupulous people also reading this very blog entry. And as opposed to you, dear reader, these users are at the same time storing the data for later reproduction using extremely sophisticated neural networking technology. At a later time these ‘pirates’, as they are known, will be able to freely reproduce important concepts, ideas or industrial secrets expressed in this entry.

And to my amazement nobody goes after them. “But it’s too hard to suppress this behavior,” it is argued. This statement holds no water with me. We can lock down computers with Digital Restrictions Management (DRM). We can shut down whole companies for producing software or hardware which enable copyright infringement. We can spend millions of taxpayer dollars on hunting down illegal information trading. We can even impose economic sanctions on countries with too liberal copyright laws.

Surely this one problem should then be easy to resolve. A small modification of today’s neural networking systems should suffice; perhaps a little chip in the bio copying devices. The chip would prevent access to Stored Intellectual Property – also sloppily referred to as ‘memories’ – without proper authorization and correct dues paid. If that doesn’t work, we can just go after the producers (colloquially called ‘pregnant women’) of this technology. Strict laws, lawsuits and legal enforcement will stem this crime wave at the root.