Tracking Users

January 01, 2002

Marketers don't want to measure raw hits on a Web site. Increasingly, they want to categorize visitors and measure the significant things those visitors do on the site. Many want to track the effectiveness of promotions in real time and make adjustments instantly. They often want to take data about Web activity offline, combine it with their traditional data, mine it, and report on it. They want to improve advertising effectiveness, visitor loyalty, purchase rates, cross-sells, and up-sells. All this is fueling demand for a new generation of Web-site analysis tools that represent visitor behavior in terms that marketers understand.

High-Traffic Recording

On popular Web sites, traffic is increasing exponentially. Traditional Web-log
analysis can take too long, because it must read and process large log files.
Even recording the raw data in a database is too slow.

Data-cube recorders don't store the raw event data at all. Instead, the
recorder creates new visitor and content categories on the fly and assembles
a statistical model of visitor behavior as event data flows in (see Figure 2).
In OLAP lingo, the statistical model is called a complete or partial "data
cube." It lets marketers rapidly roll up and drill down to see different views
of the data. This is the method Andromedia Aria has adopted, and it accounts
for the product's unique analysis and reporting capabilities.
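
To make the idea concrete, here is a minimal sketch of a data-cube recorder in Python. It is not Aria's implementation. The dimension names, the event layout, and the `DataCubeRecorder` class are all assumptions made for illustration. Each incoming event updates a counter for every roll-up level, and the event itself is then discarded.

```python
from collections import Counter
from itertools import combinations

# Hypothetical dimensions; a real recorder would derive its own categories.
DIMENSIONS = ("visitor_category", "content_category", "action")


class DataCubeRecorder:
    """Keeps aggregate counts per cell instead of storing raw events."""

    def __init__(self, dimensions=DIMENSIONS):
        self.dimensions = tuple(dimensions)
        self.cells = Counter()

    def record(self, event):
        """Fold one event into every roll-up level of the cube."""
        values = tuple(event[d] for d in self.dimensions)
        n = len(values)
        for r in range(n + 1):
            for kept in combinations(range(n), r):
                # "*" marks a dimension that has been rolled up (aggregated away).
                key = tuple(values[i] if i in kept else "*" for i in range(n))
                self.cells[key] += 1
        # The raw event is not stored anywhere after this point.

    def count(self, **fixed):
        """Return the count for a view; unspecified dimensions are rolled up."""
        key = tuple(fixed.get(d, "*") for d in self.dimensions)
        return self.cells[key]


recorder = DataCubeRecorder()
for event in [
    {"visitor_category": "new", "content_category": "product", "action": "view"},
    {"visitor_category": "new", "content_category": "product", "action": "purchase"},
    {"visitor_category": "returning", "content_category": "home", "action": "view"},
]:
    recorder.record(event)

total = recorder.count()                                  # fully rolled up
new_visitors = recorder.count(visitor_category="new")     # drill down one level
new_purchases = recorder.count(visitor_category="new", action="purchase")
```

Because every view is a single lookup in a precomputed table, reporting stays fast no matter how many events have been seen. The cost is that a category missing from the dimensions cannot be recovered later.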

The advantage of data-cube recorders is that reporting on preanalyzed data
can be very fast. Furthermore, the statistical behavior model is typically much
smaller than a collection of raw events, requiring less disk space. Finally,
because a data-cube recorder writes less data and makes less frequent commits
to the database, it can keep up with extremely popular sites where other recording
techniques have difficulty.

The downside of data-cube recorders is that raw data isn't saved. If the recorder
was not set up to generate the desired visitor or content categories automatically,
it can be impossible to go back and regenerate the statistical behavior model
after the fact.

To enable regeneration, Aria also provides a log recorder that creates compressed
output files -- optionally deleting files older than a preconfigured retention
period -- along with a log reader. Compressed-log recorders are inappropriate
for most production traffic analysis, because decompression consumes precious
processor time during the (also processor-intensive) analysis phase.
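
A compressed-log recorder with a retention period might be sketched as follows. This is an illustration only, not Aria's file format or API. The gzip-compressed JSON-lines layout, the `CompressedLogRecorder` class, and the `read_events` helper are all assumptions.

```python
import gzip
import json
import os
import time


class CompressedLogRecorder:
    """Writes raw events to gzip files and optionally purges old files."""

    def __init__(self, directory, retention_seconds=None):
        self.directory = directory
        self.retention_seconds = retention_seconds  # None keeps files forever
        os.makedirs(directory, exist_ok=True)

    def write_batch(self, name, events):
        """Write one batch of events as a compressed JSON-lines file."""
        path = os.path.join(self.directory, name + ".jsonl.gz")
        with gzip.open(path, "wt", encoding="utf-8") as f:
            for event in events:
                f.write(json.dumps(event) + "\n")
        self.purge_expired()
        return path

    def purge_expired(self, now=None):
        """Delete log files older than the retention period; return their paths."""
        if self.retention_seconds is None:
            return []
        now = time.time() if now is None else now
        removed = []
        for entry in os.listdir(self.directory):
            path = os.path.join(self.directory, entry)
            expired = now - os.path.getmtime(path) > self.retention_seconds
            if entry.endswith(".jsonl.gz") and expired:
                os.remove(path)
                removed.append(path)
        return removed


def read_events(directory):
    """Log reader: decompress and yield every stored event, file by file."""
    for entry in sorted(os.listdir(directory)):
        if entry.endswith(".jsonl.gz"):
            path = os.path.join(directory, entry)
            with gzip.open(path, "rt", encoding="utf-8") as f:
                for line in f:
                    yield json.loads(line)
```

Feeding the output of `read_events` into a reconfigured recorder is what lets the statistical model be rebuilt with different categories. The decompression inside `read_events` is the extra processor cost mentioned above.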

However, sites can run a data-cube recorder and a compressed-log recorder simultaneously,
giving them the best of both worlds. The data-cube recorder provides on-the-fly
data analysis for real-time reporting, while the compressed-log recorder lets
a Webmaster restructure categories and regenerate the statistical model afterwards,
if necessary. In practice, the combination is not used that often. Most Webmasters
set up category generation correctly in advance, and don't want to waste disk
storage and processor time creating compressed log files. -- DG
