Introduction and Motivation (1)

Web prefetching is also called next page request prediction. It is directly applicable to web page caching and link prefetching. Such systems try to predict the next page a user will access in order to reduce bandwidth usage and load on the web server. This problem affects web server cache performance and latency.

Semantic-rich Markov models for Web Prefetching, N. R. Mabroukeh and C. I. Ezeife

Introduction and Motivation (2)

Domain ontologies on the web provide a useful source of semantic information that can be used in next page prediction systems. The availability of this information, together with the trade-off between state space complexity and accuracy in Markov models, creates a need to integrate semantic information into web usage mining.

Introduction and Motivation (3)

We show two ways of integration. The first integrates semantic information directly into the transition probability matrix of a low-order MM, which solves the contradicting predictions problem. The second uses semantic information as a criterion for pruning states in a higher-order Selective MM (SMM).

Markov Models (1)

Consider a sequence of web page views generated by a user browsing the World Wide Web. This sequence can be modeled as a set of pages, or web session, $W = \{P_1, P_2, \ldots, P_l\}$, where $P_i$ is a random variable representing the $i$th page view in $W$. The problem of next page request prediction is to predict the web page that will be accessed next, i.e. $P_{l+1}$.

Markov Models (2)

The probability that a user will access a certain web page next is based either on the current state the user is visiting, resulting in a 1st-order MM, or on the previous k states in the sequence, resulting in a kth-order MM. All transition probabilities among states are stored in an n × n transition probability matrix P, where n is the number of states in the model.

The page $p_{l+1}$ that the user will most probably access next is given by

$$p_{l+1} = \operatorname*{argmax}_{p \in \mathbb{P}} \; P(P_{l+1} = p \mid P_l, P_{l-1}, \ldots, P_{l-(k-1)})$$

where $\mathbb{P}$ is the set of all pages in the web site. The argmax operator returns the page with the highest probability. The contradicting predictions problem occurs when argmax returns more than one result with equal probabilities.
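A minimal sketch of 1st-order prediction may make the contradicting predictions problem concrete. The function names and the three sessions below are hypothetical illustrations, not data or code from the presentation; transition probabilities are estimated as simple relative frequencies.

```python
from collections import defaultdict

def build_transition_matrix(sessions):
    """Estimate 1st-order transition probabilities from page-view sessions."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # Count each observed (current page -> next page) transition.
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    matrix = {}
    for current, row in counts.items():
        total = sum(row.values())
        matrix[current] = {nxt: c / total for nxt, c in row.items()}
    return matrix

def predict_next(matrix, current):
    """Return every page tied for the highest transition probability."""
    row = matrix.get(current, {})
    if not row:
        return []
    best = max(row.values())
    return sorted(page for page, prob in row.items() if prob == best)

# Hypothetical sessions, chosen so that p1 leads to p3 and p4 equally often.
sessions = [
    ["p2", "p5", "p1", "p3"],
    ["p1", "p4", "p2"],
    ["p3", "p2"],
]
P = build_transition_matrix(sessions)
print(predict_next(P, "p3"))  # a single prediction
print(predict_next(P, "p1"))  # two tied pages: a contradicting prediction
```

Returning the whole tied set, rather than an arbitrary winner, makes the ambiguity visible to whatever component has to choose a page to prefetch.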

Related Work

It has been found [3][4][7] that, as the order of the MM increases, so do the number of states and the model complexity. On the other hand, reducing the number of states leads to an inaccurate transition probability matrix and lower coverage, and thus to less predictive power and lower accuracy. In the All-Kth-Order MM [8], Markov models of differing orders are trained and used to make predictions, such that if the kth-order MM cannot make a prediction then the (k-1)th-order MM is tried, and so on. Selective Markov models (SMM) [4] store only some of the states within the model, as a solution to the trade-off problem.
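The fallback behaviour of an All-Kth-Order model can be sketched as follows. This is an illustrative reading of the description above, not the implementation from [8]; the function names and sample sessions are assumptions made for the example.

```python
from collections import Counter, defaultdict

def train_all_kth_order(sessions, max_order):
    """Count next-page frequencies for every context of length 1..max_order."""
    models = {k: defaultdict(Counter) for k in range(1, max_order + 1)}
    for session in sessions:
        for k in range(1, max_order + 1):
            for i in range(len(session) - k):
                context = tuple(session[i:i + k])
                models[k][context][session[i + k]] += 1
    return models

def predict(models, history):
    """Try the highest order first, then fall back to lower orders."""
    for k in sorted(models, reverse=True):
        if len(history) < k:
            continue
        context = tuple(history[-k:])
        if context in models[k]:
            page, _ = models[k][context].most_common(1)[0]
            return page, k  # predicted page and the order that produced it
    return None, 0  # no model of any order covers this history

# Hypothetical sessions for illustration only.
sessions = [["a", "b", "c"], ["a", "b", "c"], ["d", "b", "e"]]
models = train_all_kth_order(sessions, max_order=2)
print(predict(models, ["a", "b"]))  # answered by the 2nd-order model
print(predict(models, ["x", "b"]))  # falls back to the 1st-order model
```

Every context of every order is stored here, which is exactly the state space growth that selective models try to reduce.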

Selective Markov Models

Selective Markov models start from an All-Kth-Order MM and then apply a post-pruning step to remove states that are not expected to be accurate predictors. Deshpande and Karypis [4] provide three criteria, which may be used separately, for pruning states in the model: frequency, confidence, and error. They did not study the effect of domain knowledge and semantics on selective Markov models.

Domain Knowledge representation (1)

A standard ontology framework is used with OWL. Each web page is annotated with semantic information using the OntoMat* annotizer, or a similar tool.

* University of Karlsruhe, http://ontomat.projects.semwebcentral.org

Domain Knowledge representation (2)

Web pages $p_i$ representing products are mapped as instances of ontology classes. For example, page p2 from WASD contains the Canon PowerShot A2000 IS, which makes it an instance of the class Digital Still Camera in the ontology.

Semantic Distance

After mapping pages to classes, the semantic distance $M_{p_i,p_j}$ between classes is measured as the number of separating is-a edges in the ontology. A Semantic Distance Matrix M is an n × n matrix of all the semantic distances among the n web pages in the sequence database.
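Counting is-a edges can be sketched for the simple case where the ontology is a tree given as a child-to-parent map. Apart from Digital Still Camera, the class names, the page-to-class mapping, and the function names below are hypothetical, and a real OWL ontology may allow multiple parents, which this sketch does not handle.

```python
def ancestors(cls, parent_of):
    """Return [cls, parent, grandparent, ...] up to the ontology root."""
    chain = [cls]
    while cls in parent_of:
        cls = parent_of[cls]
        chain.append(cls)
    return chain

def semantic_distance(a, b, parent_of):
    """Number of is-a edges on the path between classes a and b."""
    depth_from_b = {c: d for d, c in enumerate(ancestors(b, parent_of))}
    for steps_from_a, c in enumerate(ancestors(a, parent_of)):
        if c in depth_from_b:
            # c is the closest common ancestor of a and b.
            return steps_from_a + depth_from_b[c]
    raise ValueError("classes share no common ancestor")

# Hypothetical is-a hierarchy (child -> parent).
parent_of = {
    "DigitalStillCamera": "Camera",
    "VideoCamera": "Camera",
    "Camera": "Electronics",
    "Lens": "Electronics",
}
# Hypothetical mapping of pages to ontology classes.
page_class = {
    "p1": "DigitalStillCamera",
    "p2": "DigitalStillCamera",
    "p3": "VideoCamera",
    "p4": "Lens",
}
pages = sorted(page_class)
# Semantic Distance Matrix M: one row and one column per page.
M = [[semantic_distance(page_class[a], page_class[b], parent_of) for b in pages]
     for a in pages]
for page, row in zip(pages, M):
    print(page, row)
```

Pages mapped to the same class get distance 0 under this measure, so the matrix is symmetric with a zero diagonal.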

Maximum Semantic Distance

Maximum Semantic Distance is the maximum allowed semantic distance between any two web pages. It is inversely proportional to the maximum level of relatedness a user would allow between two concepts. It can be user-specified, or it can be calculated automatically from the minimum support value specified for the mining algorithm, by applying it as a restriction on the number of is-a edges in the ontology graph.

Semantics integration in MM

We show two ways of integration. The first integrates semantic information directly into the transition probability matrix of a low-order MM, which solves the contradicting predictions problem. The second uses semantic information as a criterion for pruning states in a higher-order Selective MM (SMM).

Semantic-rich MM

The semantic distance matrix M is combined directly with the transition matrix P of a Markov model of the given sequence database, producing a weight matrix W. The predictor software consults this weight matrix, instead of P, to determine future page view transitions for caching or prefetching. The Weight Matrix W is thus an n × n matrix that results from combining the semantic distance matrix M with the Markov transition probability matrix P.

Assume that in the test set the user went through the sequence of page views <p2 p5 p1 p3>. Looking at P, there is a 100% probability that the user will next view page p2. A problem that can arise here is a contradicting prediction. For example, notice that P(p3|p1) = P(p4|p1), which means that a user is equally likely to view page p3 or p4 after viewing page p1. The system therefore cannot tell which of the two pages is the more relevant prediction after p1.
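One way to see how semantic information can resolve such a tie is the sketch below. It uses a deliberately simple stand-in rule, assumed for illustration only and not the authors' weight matrix: among the pages tied for the highest transition probability, pick the one with the smallest semantic distance from the current page. All probabilities and distances shown are hypothetical.

```python
def predict_with_semantics(P, M, current):
    """Pick the most probable next page, breaking ties by semantic distance.

    Assumed rule for illustration: among equally probable candidates,
    prefer the page semantically closest to the current page.
    """
    row = P[current]
    best = max(row.values())
    tied = [page for page, prob in row.items() if prob == best]
    # The page name is a secondary key so the result is deterministic.
    return min(tied, key=lambda page: (M[current][page], page))

# Hypothetical transition probabilities with a tie after p1.
P = {
    "p1": {"p3": 0.5, "p4": 0.5},
    "p3": {"p2": 1.0},
}
# Hypothetical semantic distances (number of is-a edges).
M = {
    "p1": {"p3": 2, "p4": 5},
    "p3": {"p2": 1},
}
print(predict_with_semantics(P, M, "p1"))  # tie resolved by distance
print(predict_with_semantics(P, M, "p3"))  # no tie, probability decides
```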

Using Semantic Distance for State Pruning in SMM

In this case, an All-Kth-Order Markov model is built first. Next, states that do not contribute to the model, i.e. those with zero frequency, are pruned. Then, any state $S^k_j$ whose semantic distance exceeds the maximum semantic distance is pruned from the model, where $l$ is the number of pages the user has visited so far and $j$ is a simple enumeration of the states in the model. In other words, states whose pages are more than the maximum semantic distance away from each other are pruned out of the model. This results in a smaller state space, and so less memory consumption, while still providing prediction accuracy equal to that of regular SMM.
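The two pruning passes can be sketched as below. The exact distance condition is not fully stated here, so the sketch assumes one plausible reading: a state (a tuple of pages) is pruned when any two of its pages are farther apart than the maximum semantic distance. The function name, state counts, and distances are hypothetical.

```python
from itertools import combinations

def prune_states(state_counts, M, max_distance):
    """Drop zero-frequency states, then states whose pages are too far apart.

    Assumption: a state is "too far apart" when any pair of its pages has a
    semantic distance greater than max_distance.
    """
    kept = {}
    for state, freq in state_counts.items():
        if freq == 0:
            continue  # frequency pruning: the state never occurred
        too_far = any(M[a][b] > max_distance for a, b in combinations(state, 2))
        if not too_far:
            kept[state] = freq
    return kept

# Hypothetical symmetric semantic distances between three pages.
M = {
    "p1": {"p1": 0, "p2": 1, "p3": 4},
    "p2": {"p1": 1, "p2": 0, "p3": 2},
    "p3": {"p1": 4, "p2": 2, "p3": 0},
}
# Hypothetical states of an All-Kth-Order model with their frequencies.
state_counts = {
    ("p1",): 3,
    ("p1", "p2"): 2,
    ("p1", "p3"): 1,
    ("p2", "p3"): 0,
    ("p3", "p2"): 1,
}
pruned = prune_states(state_counts, M, max_distance=2)
print(pruned)
```

Single-page states have no page pairs, so only the frequency check applies to them.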

[Figure: comparison of the 1st-order, semantic 1st-order, 2nd-order, All-Kth-order, and FPSMM models on data sets DS-1, DS-2, and DS-3.]

[Figure: results for data sets DS-1 and DS-3.]

Conclusions

The performance of semantic-rich 1st-order Markov models is studied and compared with that of higher-order SMM and SP-SMM. SP-SMM is 16% smaller than FPSMM and provides nearly equal accuracy. It was also found that semantic-rich low-order Markov models can overcome the problem of contradicting predictions. Future work includes developing a method that gathers richer semantic information than simple semantic distance, and investigating the benefits of extending the ontology with semantic information that can aid inferencing, along with association rules, in the recommendation phase of web usage mining.