Notes on usability and related things by a project manager who manages electronic publishing projects.

May 28, 2012

When they were younger, both my kids were fans of Dr Seuss's Sleep Book (as I was in my turn). Among the delights for a small bed-time reader, Dr Seuss provides real-time statistics about the number of people currently asleep, and (like a good statistics provider) publishes his methodology:

"We find out how many, we learn the amount,

By an Audio-Telly-o-Tally-o Count.

On a mountain, halfway between Reno and Rome,

We have a machine in a plexiglass dome

Which listens and looks into everyone's home.

And whenever it sees a new sleeper go flop,

It jiggles and lets a new Biggel-Ball drop.

Our chap counts these balls as they plup in a cup,

And that's how we know who is down and who's up."

There's also a wonderfully goofy illustration of the machine, and I think it was this, rather than any predestination to work on usage statistics, that made this one of my favourite parts of the book when I was a child.

In the real world, web usage statistics sometimes seem to offer the power (and intrusion) of the Audio-Telly-o-Tally-o Count, only to snatch it away again, and offer subtly different statistics, with various caveats.

As an example, suppose you own a website, and understandably want to know "how many visits did people make to my site in the last week?"

If you had an Audio-Telly-o-Tally-o Count, your "chap" would magically listen and look into everyone's home or office, find the people actually visiting the site... and then all that remains is to count the Biggel-Balls.

Of course, that's not what a usage stats tool does.

When someone requests a page from your website, their browser sends one or more requests to your website's "webserver", asking it to send the text, images and other content that the customer needs to view the page. Along with each request comes some information about the customer's computer - its IP address, plus some information about its operating system and the kind of browser the customer is using. It also often tells us the URL of the page the customer came from. The webserver can record all this (in a file called a "server log") along with the time of the request, for analysis later. This combination of facts is fairly unique to the computer - think of it perhaps as being like a footprint. When the customer requests the next page, all this happens again, creating a further "footprint". As the customer visits further pages, his or her computer creates a series of further "footprints" - the process of following their visit for analysis purposes is a bit like following a trail of footprints down a beach.
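(For the technically curious, here's a toy Python sketch of pulling one "footprint" out of a server log line. I'm assuming the common Apache/nginx "combined" log format; the sample line and field names are invented for illustration, and a real analytics tool does far more than this.)

    import re
    from datetime import datetime

    # A toy parser for one line of an Apache/nginx "combined" format server log.
    LOG_PATTERN = re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
        r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
        r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
    )

    def footprint(log_line):
        """Extract the facts that make a request fairly (but not
        completely) unique to one computer - the "footprint"."""
        m = LOG_PATTERN.match(log_line)
        if m is None:
            return None
        return {
            "ip": m.group("ip"),                  # the computer's apparent address
            "user_agent": m.group("user_agent"),  # browser and OS details
            "referrer": m.group("referrer"),      # the page the customer came from
            "time": datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
            "request": m.group("request"),        # which page was asked for
        }

    line = ('203.0.113.7 - - [28/May/2012:10:15:32 +0000] '
            '"GET /about.html HTTP/1.1" 200 5120 '
            '"http://example.com/index.html" "Mozilla/5.0 (Windows NT 5.1)"')
    print(footprint(line))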

For completeness I should say that not all analysis runs from server logs. In a popular alternative, the webserver includes a small program with each webpage, which the customer's browser runs when it assembles the page. The "small program" (used for example by Google Analytics) causes a message to be sent out to a log file for analysis later - and because it runs in the browser, it can also gather details a plain server log cannot see, such as the customer's screen size. And there are other methods. For the purposes of the discussion here, it comes to much the same thing.

As another aside, it is of course sometimes possible to require every user to log in, and then to follow their individually-identified activity with a cookie. That provides more detailed information, but is not always desirable (in some circumstances requiring people to log in makes them go away instead; not everyone will accept cookies; and so on).

Pushing on with the "footprints on the beach" analogy, it's worth noting that we have a science fiction or fantasy beach here - trails can suddenly start as if someone were teleported in by futuristic technology or magic (e.g. the customer came in from a bookmark, or typed the URL of our page rather than following a link that we can detect). Similarly, trails of footprints almost always suddenly stop (e.g. the customer stopped using their browser, or went to another site). The way the Internet works means that customers don't have to do anything formal to leave your site; they just stop requesting pages.

Imagine now a detective following these trails of footprints around on the beach. How do the clues compare with the Audio-Telly-o-Tally-o Count?

The detective has the following problems:

"Footprints" are fairly unique to a given computer, but not completely so. If Big Corporation Inc. has bought a batch of identical computers and successfully forbids its staff from customizing them in any way, then all the computers will have identical footrprints. The detective may struggle to sort out all those size 42 Converse Sneakers. The usual counting rule is to count all this as one user (technically one "unique browser") whereas the the Audio-Telly-o-Tally-o Count can magicallly see several people, and so drops several Biggel-Balls.

The Audio-Telly-o-Tally-o Count magically sees exactly where people stop using your website and do something else - and where they are still on the website, but not requesting pages. The detective only has the observation that the footprints stopped (the standard is to declare that a user session has ended if there are no more page requests for 30 minutes). Clearly this is arbitrary - the Audio-Telly-o-Tally-o Count might know that the user is still avidly reading a long web page, has broken off to answer the phone, and so on. So we might get one Biggel-Ball, as opposed to counting a new session each time there is a 30-minute gap.
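(Here's a minimal Python sketch of that 30-minute rule, so you can see how mechanical it is. The timestamps are invented; a real tool applies the same idea per footprint across millions of log lines.)

    from datetime import datetime, timedelta

    SESSION_GAP = timedelta(minutes=30)  # the conventional cut-off

    def count_sessions(request_times):
        """Count sessions for one footprint, declaring a new session
        whenever 30 or more minutes pass with no page request."""
        times = sorted(request_times)
        if not times:
            return 0
        sessions = 1
        for previous, current in zip(times, times[1:]):
            if current - previous >= SESSION_GAP:
                sessions += 1
        return sessions

    # A customer pauses for 40 minutes to read a long page (or answer the phone):
    visit = [datetime(2012, 5, 28, 9, 0),
             datetime(2012, 5, 28, 9, 5),
             datetime(2012, 5, 28, 9, 10),
             datetime(2012, 5, 28, 9, 50)]  # 40-minute gap before this request
    print(count_sessions(visit))  # -> 2, though a human would call it one visit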

The detective is counting "footprints" of a computer, not the people behind it. So imagine a public library which has one computer, on which people come and go all day, many of them looking at your website. The Audio-Telly-o-Tally-o Count magically follows this, counting the people coming and going. The detective, following computer-generated "footprints", does not know that a different human is now filling those shoes. If there's a 30-minute break, of course the detective assumes this is a new session, but if the queue at the computer is moving swiftly enough, then this won't happen often, and several different humans will be counted as one visit.

The Audio-Telly-o-Tally-o Count magically watches as a user switches from their desktop PC to their laptop or mobile device, or their computer at home, and can tell that this is one human continuing his or her visit. But each of these devices has a different footprint for the detective - the trainers suddenly stop, and a pair of heels carry on down the beach. So (unless the customer identifies themselves, e.g. by logging in) the detective counts a new visit each time the user switches device.

So, since we do not have an Audio-Telly-o-Tally-o Count, we can't count "visits" exactly in the common-sense meaning of the term. We can count "unique browsers" and "sessions" and combine those into a "visit" - a statistic which has some sources of error, but at least the major sources of error are known and the statistic is captured by a known and reproducible method. Note that the methodology is such that errors will usually result in under-counting: probably better for the business and its advertisers than getting an inflated idea of the traffic. It's currently the best that can be done - not because of the limitations of your usage analytics tool or usage analytics people, but because of limits on what you can actually measure, and the decisions you have to make to interpret it.

January 27, 2010

Well, today Apple are going to "unveil" the iSlate or iTablet. Once again they have been wonderfully adept at stoking a huge amount of enthusiastic speculation. Hats off to them for the marketing, but I'm not going to write about it until the product is there.

Stoking speculation is fun and good for business, however. So imagine the excitement to come with these Apple products that, I can exclusively reveal, I have totally made up:

iLiner - thin, silvery, shiny cruise ship

iPerdrive - futuristic propulsion system for cockney starships

iDoll - a child's toy, likely to be banned in fundamentalist families (engraving will be available from Apple stores if you want a graven iDoll)

October 08, 2009

Ask some people what car they drive and they'll answer "A Ford Focus Zetec 1.6" (for example). Ask others and they will say "er...a blue one?". Which is of course an OK answer - you need to know how to drive it, how to recognize it in the car park and whether to refuel it with petrol or diesel. And that's probably all you really need to know for daily purposes, even if you use the car a lot.

In an interesting post on the Google blog, Jason Toff did a straw poll of his friends, asking them "what is a browser?" The results are interesting: 90% of his poll knew which car they drove, compared with 50% knowing which browser they use, even though most of the sample spent more time online than driving.

Linked from Jason's post is an interesting (and entertaining) video in which someone from Google does a vox pop in Times Square, New York, asking people what a browser is, which one they use, and whether they know the difference between a browser and a search engine. OK, if the interviewees had just been told they were speaking to Google this might have muddled them up a bit, but fewer than 8% of interviewees (not even Santa Claus) knew what a browser was. You get a strong impression that for a lot of folks Windows/Internet Explorer/Google is all part of "the blue one". Something which I (with 4 browsers on my system for work purposes) need to remember!

August 19, 2008

CAPTCHAs ("Completely Automated Public Turing test to tell Computers and Humans Apart") are automated tests that humans should be able to pass but 'bots will struggle with. A common example is registration systems that require you to type out words that you see in distorted form in a box, like this example (generated by the reCaptcha program). I've been wondering what those were called...

There's an obvious accessibility issue caused by these (if you can't see the distorted text, you can't complete the test of typing it out). In reCaptcha's case, there is an alternative test as an attempt to deal with this - click the loudspeaker button to hear a sequence of numbers read out against some background noise, thereby creating an auditory equivalent of the "type these words" captcha. You can also click the "refresh" icon on reCaptcha's interface to get new words, if the ones you see are too distorted. That would also help with my personal source of stress when filling these out - being presented with words that have become ambiguous (e.g. is it a 1 or an l? A 0 or an O? Is that letter upper- or lower-case? If I get the captcha wrong, will I spend many minutes re-entering the data I have typed into the web page?).

The "stop spam read books" tag line in reCaptcha's UI refers to an interesting feature - the distorted words come from scanned literature (sensible because you presumably need such a large selection of words that the captcha can't be solved by guessing words or throwing a dictionary at it). reCaptcha use the typed user input not only for solving the captcha but to quality check the scans they have made of the literature, as part of an project to digitize it.

Sam Michael, who moderates an excellent set of email discussion lists from chinwag.com, has recently summarized a discussion about captchas, and added his own experiences, including this humorous "captcha gotcha", a consequence of using words randomly culled from literature (for anyone unable to see the captcha, it came up with a couple of rude or unfortunate words...)

Sam's post about captchas also contains some other ideas about making it easier to tell whether a human or computer has submitted a web page:

have an invisible field - humans won't see it and so won't fill it out, but an automated script will. My thought about this is that you ought to think about whether screen readers will see the field and prompt any blind users to fill it in, however

require the web page to be open for a minimum time before submission, thereby foiling automated scripts that fill the page out much more quickly than a human could. (A sketch of both checks follows below.)
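Here's a minimal Python sketch of how a server might apply both checks. The "website" honeypot field name and the five-second threshold are my own inventions for illustration:

    import time

    MIN_SECONDS_TO_FILL = 5  # assume no human submits a form faster than this

    def looks_like_a_bot(form_data, time_page_served):
        """Apply both checks from the list above. "website" is a
        hypothetical honeypot field, hidden from humans by CSS."""
        if form_data.get("website"):  # the invisible field was filled in
            return True
        if time.time() - time_page_served < MIN_SECONDS_TO_FILL:
            return True               # submitted impossibly quickly
        return False

    # A human leaves the hidden field empty and takes a while:
    print(looks_like_a_bot({"name": "Alice", "website": ""}, time.time() - 42))  # False
    # A script fills every field and submits instantly:
    print(looks_like_a_bot({"name": "x", "website": "http://spam"}, time.time()))  # True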

August 17, 2006

[In which I tell how I was caught out by an old "gotcha" and then go on to list useful resources about choosing colours for websites.]

Yesterday I saw something I have not seen for a little while. We were sat around a monitor reviewing pages of a website. The design uses tinted background boxes to help break the pages into logical groups of controls and input fields, but these boxes were not showing up at all.

Our first thought was that recent changes to the stylesheet had broken this bit, but when we found that we COULD see the tints on another monitor we realized that the stylesheet was not to blame at all; it was the choice of background colour, which some monitors could not render.

Back in the past (the 1990s, say), we saw this kind of thing much more. Computer monitors could only manage a few colours, and an agreed set of 216 of them was created, called the web safe colors. (This article from W3Schools outlines the basics of web safe colors, gives some history, and is a good introduction.) This limited palette of web safe colours is still the safest bet for a website - pretty much any monitor will display them, showing them pretty much how you wanted (though with some difference in brilliance etc. due to the manufacture of different monitors, their age, the environment in which the monitor sits and so on, let alone variations in customers' vision).
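(If you're curious where the number 216 comes from: each of the red, green and blue channels takes one of just six values. A couple of lines of Python show it:)

    # The web safe palette: six possible values per channel, so 6 x 6 x 6 colours.
    LEVELS = ["00", "33", "66", "99", "CC", "FF"]
    web_safe = ["#" + r + g + b for r in LEVELS for g in LEVELS for b in LEVELS]
    print(len(web_safe))  # -> 216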

These days, of course, most monitors can display millions of different colours. So you would expect that a much wider range of colours would be OK (or most probably OK) to use. As far as I know, though, there is no updated guide to the colours that can be considered safe, and (as I found yesterday) you can still get a surprise if the designer has chosen a non-web-safe color and you look at the site on a different monitor.

In this case, we had not been so silly as to rely completely on the background tint to make the page make sense (among other things that would have been very poor accessibility), but the page without the tint was missing a useful visual cue. We've changed to a web safe color now, which is probably what we should have done in the first place.

This is probably a good place to introduce some sites I find useful when dealing with colours:

I did have links to sites that let you convert the Pantone colour that your print colleagues want into the nearest websafe color, but these sites are gone now (suggestions of a good one are welcome!). Being able to match to Pantone is useful if you want to have the same colours in your websites and printed materials.
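In the meantime, snapping a colour to its nearest web safe neighbour is simple enough to do yourself. A Python sketch (per-channel rounding; the example colour is made up):

    def nearest_web_safe(hex_colour):
        """Snap an RGB colour to the nearest of the 216 web safe colours.
        Each channel of a web safe colour is a multiple of 51 (0x33),
        so round each channel to the nearest multiple of 51."""
        hex_colour = hex_colour.lstrip("#")
        channels = [int(hex_colour[i:i + 2], 16) for i in (0, 2, 4)]
        safe = [round(c / 51) * 51 for c in channels]
        return "#" + "".join(f"{c:02X}" for c in safe)

    print(nearest_web_safe("#4A90D9"))  # -> #3399CC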

One final thought, which comes from an article on the Pantone site called "Graphics - dare to go beyond web-safe colors". From the title, you can guess the author's stance on the issue. I'm not sure I agree with his idea that web safe colours are only relevant to those few customers who are still running a 256-colour monitor. But I like his idea of testing a colour scheme by taking your monitor and reducing its colour settings as much as you can*. Then check how your site looks. I doubt this is foolproof, but I would expect it to be useful enough to be worth the few minutes it takes. Wish I'd thought of that.

While you have the control panel open, you might also want to try changing your screen resolution to experience the website design at different screen sizes - that can be a shock too!

*To change colour settings in Windows XP, find the Control Panel from the Start menu; click Display; click the Settings tab and find the control that sets the number of colours. Make a note of how it is set at present so that you can put it back when done, then lower it as far as it will go (to "256 colors" if your monitor supports this - my monitor won't go that low).

August 04, 2006

Recently I was working on a new look for a website. Despite assurances that all the old pages would be redirected nicely, things went a little bit wrong, it would seem, leaving us with a nice new website on which some pages were "404s" (404 is the code that HTTP returns when a page does not exist, and often the customer will see a message in their browser that includes "404"). Obviously it is very bad news indeed if customers and search engines can't find your pages. All was sorted by the boys in the server room fairly quickly, but I did learn a few things.

Rather than have customers see a default 404 page provided by their browser, we quickly put in a special page - this apologises for the inconvenience, and offers some navigation options that we hope will be helpful - the primary navigation tabs of the site are there, and if this particular site had site search that would be there too. There are a couple of gotchas in doing this:

It is important to return a true 404 to any visiting browser or search engine bot, so that any pages that really have been discontinued drop out of the search engine results.

It seems like a good idea at first to redirect customers straight to, say, the home page or the site map, but it is not - if it is the site map or home page that is broken, customers go round and round in a loop.

Recently I saw an interesting refinement on this when I got a 404 page from the Roxio site - not only do they offer an apology, but they also give you a code you can use to get a 10% discount on your next purchase.
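For what it's worth, here's a minimal Python sketch of the idea, using the Flask framework (our actual site didn't use Flask - this is just to show the principle of serving a friendly page while still returning the true 404 status):

    from flask import Flask

    app = Flask(__name__)

    @app.errorhandler(404)
    def page_not_found(error):
        # Serve the friendly page, but return the true 404 status code,
        # so search engines drop discontinued pages from their results.
        body = """<html><body>
          <h1>Sorry - we can't find that page</h1>
          <p>Try the <a href="/">home page</a> or the navigation tabs above.</p>
        </body></html>"""
        return body, 404  # NOT a redirect, and NOT status 200

    if __name__ == "__main__":
        app.run()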

Obviously, I wanted to check we'd addressed all the broken links. A useful tool for doing this was Xenu's Link Sleuth. This is a free piece of software (the author invites users to make a donation to his favourite cause). Given the URL of your home page, it follows all the links and reports on whether they work or have gone bad. It produces a useful report that you can also use to build a site map.
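(Not that you need to write your own, but the core of what such a tool does fits in a few lines of Python. This toy version only checks the links on a single page, where Xenu follows them right through the site:)

    from html.parser import HTMLParser
    from urllib.error import HTTPError, URLError
    from urllib.parse import urljoin
    from urllib.request import urlopen

    class LinkCollector(HTMLParser):
        """Gather the href of every <a> tag on a page."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def check_links(page_url):
        """Fetch one page, then report the HTTP status of every link on it."""
        html = urlopen(page_url).read().decode("utf-8", errors="replace")
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            full_url = urljoin(page_url, link)  # resolve relative links
            try:
                status = urlopen(full_url).status
            except HTTPError as err:
                status = err.code               # e.g. the dreaded 404
            except URLError:
                status = "unreachable"
            print(status, full_url)

    check_links("http://example.com/")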

Phew - so after a while harmony was restored and we were back where we thought we would be on launch day.