This paper studies the problem of categorical data clustering, especially for transactional data characterized by high dimensionality and large volume. Starting from a heuristic method of increasing the height-to-width ratio of the cluster histogram, we develop a novel algorithm, CLOPE, which is very fast and scalable, while being quite effective. We demonstrate the performance of our algorithm on two real-world datasets, and compare CLOPE with state-of-the-art algorithms.

Keywords

data mining, clustering, categorical data, scalability

1. INTRODUCTION

Clustering is an important data mining technique that groups together similar data records [12, 14, 4, 1]. Recently, more attention has been put on clustering categorical data [10, 8, 6, 5, 7, 13], where records are made up of non-numerical attributes. Transactional data, like market basket data and web usage data, can be thought of as a special type of categorical data with boolean values, with all the possible items as attributes. Fast and accurate clustering of transactional data has many potential applications in the retail industry, e-commerce intelligence, etc.

However, fast and effective clustering of transactional databases is extremely difficult because of the high dimensionality, sparsity, and huge volumes often characterizing these databases. Distance-based approaches like k-means [11] and CLARANS [12] are effective for low-dimensional numerical data. Their performance on high-dimensional categorical data, however, is often unsatisfactory [7]. Hierarchical clustering methods like ROCK [7] have been demonstrated to be quite effective in categorical data clustering, but they are naturally inefficient in processing large databases.

The LargeItem [13] algorithm groups large categorical databases by iterative optimization of a global criterion function. The criterion function is based on the notion of large items, that is, items in a cluster with occurrence rates larger than a user-defined parameter, the minimum support. Computing the global criterion function is much faster than computing local criterion functions defined on top of pair-wise similarities. This global approach makes LargeItem very suitable for clustering large categorical databases.

In this paper, we propose a novel global criterion function that tries to increase the intra-cluster overlapping of transaction items by increasing the height-to-width ratio of the cluster histogram. Moreover, we generalize the idea by introducing a parameter to control the tightness of the clusters. Different numbers of clusters can be obtained by varying this parameter. Experiments show that our algorithm runs much faster than LargeItem, with clustering quality quite close to that of the ROCK algorithm [7].

To get a basic idea behind our algorithm, let us take a small market basket database with 5 transactions {(apple, banana), (apple, banana, cake), (apple, cake, dish), (dish, egg), (dish, egg, fish)}. For simplicity, transaction (apple, banana) is abbreviated to ab, etc. For this small database, we want to compare the following two clusterings: (1) {{ab, abc, acd}, {de, def}} and (2) {{ab, abc}, {acd, de, def}}. For each cluster, we count the occurrence of every distinct item, and then obtain the height (H) and width (W) of the cluster. For example, cluster {ab, abc, acd} has the occurrences a:3, b:2, c:2, and d:1, with H=2.0 and W=4. Figure 1 shows these results geometrically as histograms, with items sorted in decreasing order of their occurrences, only for the sake of easier visual interpretation.

Figure 1. Histograms of the two clusterings. Clustering (1): {ab, abc, acd} with H=2.0, W=4 and {de, def} with H=1.67, W=3. Clustering (2): {ab, abc} with H=1.67, W=3 and {acd, de, def} with H=1.6, W=5.

We judge the quality of these two clusterings geometrically, by analyzing the heights and widths of the clusters. Leaving out the two identical histograms for cluster {de, def} and cluster {ab, abc}, the other two histograms are of different quality. The histogram for cluster {ab, abc, acd} has only 4 distinct items for 8 blocks (H=2.0, H/W=0.5), but the one for cluster {acd, de, def} has 5, for the same number of blocks (H=1.6, H/W=0.32). Clearly, clustering (1) is better since we prefer more overlapping among transactions in the same cluster.


From the above example, we can see that a larger height-to-width ratio of the histogram means better intra-cluster similarity. We apply this straightforward intuition as the basis of our clustering algorithm and define the global criterion function using the geometric properties of the cluster histograms. We call the new algorithm CLOPE, for Clustering with sLOPE. While being quite effective, CLOPE is very fast and scalable when clustering large transactional databases with high dimensions, such as market basket data and web server logs.

The rest of the paper is organized as follows. Section 2 analyzes the categorical clustering problem more formally and presents our criterion function. Section 3 details the CLOPE algorithm and its implementation issues. In Section 4, experimental results of CLOPE and LargeItem on real-life datasets are compared. After some discussion of related work in Section 5, Section 6 concludes the paper.

2. CLUSTERING WITH SLOPE

Notations  Throughout this paper, we use the following notations. A transactional database D is a set of transactions {t1, ..., tn}. Each transaction is a set of items {i1, ..., im}. A clustering {C1, ..., Ck} is a partition of {t1, ..., tn}, that is, C1 ∪ ... ∪ Ck = {t1, ..., tn}, Ci ≠ ∅ for every i, and Ci ∩ Cj = ∅ for any 1 ≤ i < j ≤ k. Each Ci is called a cluster. Unless otherwise stated, n, m, and k are used respectively for the number of transactions, the number of items, and the number of clusters.

A good clustering should group together similar transactions. Most clustering algorithms define some criterion function and optimize it, maximizing the intra-cluster similarity and the inter-cluster dissimilarity. The criterion function can be defined locally or globally. In the local way, the criterion function is built on the pair-wise similarity between transactions. This has been widely used for numerical data clustering, using pair-wise similarities like the Lp metric ((Σ|xi−yi|^p)^{1/p}) between two points. Common similarity measures for categorical data are the Jaccard coefficient (|t1∩t2| / |t1∪t2|), the Dice coefficient (2×|t1∩t2| / (|t1|+|t2|)), or simply the number of common items between two transactions [10]. However, for large databases, the computational costs of these local approaches are heavy compared with those of the global approaches.

Pioneered by Wang et al. in their LargeItem algorithm [13], global similarity measures can also be used in categorical data clustering. In global approaches, no pair-wise similarity measures between individual transactions are required. Clustering quality is measured at the cluster level, utilizing information like the sets of large and small items in the clustering. Since the computation of these global metrics is much faster than that of pair-wise similarities, global approaches are very efficient for the clustering of large categorical databases.

Compared with LargeItem, CLOPE uses a much simpler but effective global metric for transactional data clustering. A better clustering is reflected graphically as a higher height-to-width ratio. Given a cluster C, we can find all the distinct items in the cluster, together with their respective occurrences, that is, the number of transactions containing each item. We write D(C) for the set of distinct items, and Occ(i, C) for the occurrence of item i in cluster C. We can then draw the histogram of a cluster C, with items on the X-axis, decreasingly ordered by their occurrences, and occurrence on the Y-axis. We define the size S(C) and width W(C) of a cluster C below:

    S(C) = \sum_{i \in D(C)} Occ(i, C) = \sum_{t \in C} |t|, \qquad W(C) = |D(C)|

The height of a cluster is defined as H(C) = S(C)/W(C). We will simply write S, W, and H for S(C), W(C), and H(C) when C is not important or can be inferred from context.

To illustrate, Figure 2 below details the histogram of the last cluster of Figure 1. Please note that, geometrically in Figure 2, the histogram and the dashed rectangle with height H and width W have the same size S.

Figure 2. The detailed histogram of cluster {acd, de, def}, with items on the X-axis and occurrence on the Y-axis (S=8, W=5, H=1.6).
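As an illustration of these definitions, the following short Python sketch (ours, not part of the CLOPE implementation) computes S, W, and H directly from a cluster represented as a list of item sets; it reproduces the numbers of Figure 2.

    from collections import Counter

    def histogram(cluster):
        # Occ(i, C) for every distinct item i in D(C)
        occ = Counter()
        for t in cluster:
            occ.update(t)
        return occ

    def size(cluster):    # S(C): total number of blocks in the histogram
        return sum(len(t) for t in cluster)

    def width(cluster):   # W(C): number of distinct items
        return len(histogram(cluster))

    def height(cluster):  # H(C) = S(C) / W(C)
        return size(cluster) / width(cluster)

    cluster = [{'a', 'c', 'd'}, {'d', 'e'}, {'d', 'e', 'f'}]
    print(size(cluster), width(cluster), height(cluster))  # 8 5 1.6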

It is straightforward to see that a larger height means heavier overlap among the items in the cluster, and thus more similarity among the transactions in the cluster. In our running example, the height of {ab, abc, acd} is 2, and the height of {acd, de, def} is 1.6. We know that clustering (1) is better, since all the other characteristics of the two clusterings are the same.

However, height alone is not enough to define our criterion function. Take the very simple database {abc, def}. There is no overlap between the two transactions, but the clustering {{abc, def}} and the clustering {{abc}, {def}} have the same height 1. Another choice works better for this example. We can use the gradient G(C) = H(C)/W(C) = S(C)/W(C)^2 instead of H(C) as the quality measure for cluster C. Now the clustering {{abc}, {def}} is better, since the gradients of the two clusters in it are both 1/3, larger than 1/6, the gradient of cluster {abc, def}.

To define the criterion function of a clustering, we need to take into account the shape of every cluster as well as the number of transactions in it. For a clustering C = {C1, ..., Ck}, we use the following straightforward definition of the criterion function:

    Profit(\mathbf{C}) = \frac{\sum_{i=1}^{k} G(C_i) \times |C_i|}{\sum_{i=1}^{k} |C_i|} = \frac{\sum_{i=1}^{k} \frac{S(C_i)}{W(C_i)^2} \times |C_i|}{\sum_{i=1}^{k} |C_i|}

In fact, the criterion function can be generalized using a parametric power r instead of 2, as follows:

    Profit_r(\mathbf{C}) = \frac{\sum_{i=1}^{k} \frac{S(C_i)}{W(C_i)^r} \times |C_i|}{\sum_{i=1}^{k} |C_i|}
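Read directly off this definition, Profit_r can be sketched in a few lines of Python (a naive, non-incremental version of ours; Section 3 describes the fast computation actually used):

    def profit_r(clustering, r):
        # clustering: list of clusters, each a list of item sets
        numerator = 0.0
        n = 0
        for cluster in clustering:
            s = sum(len(t) for t in cluster)      # S(C)
            w = len(set().union(*cluster))        # W(C)
            numerator += s / w**r * len(cluster)  # S(C)/W(C)^r * |C|
            n += len(cluster)
        return numerator / n

With r = 2 this reproduces the gradient example above: for the database {abc, def}, the clustering {{abc}, {def}} scores 1/3 while {{abc, def}} scores only 1/6.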

Here, r is a positive real number called the repulsion, used to control the level of intra-cluster similarity. (In most cases, r should be greater than 1; otherwise, two transactions sharing no common item can end up in the same cluster.) When r is large, transactions within the same cluster must share a large portion of common items; otherwise, separating these transactions into different clusters will result in a larger profit. For example, compare the two clusterings of the database {abc, abcd, bcde, cde}: (1) {{abc, abcd, bcde, cde}} and (2) {{abc, abcd}, {bcde, cde}}. For clustering (2) to achieve a larger profit, its profit, (7/4^r × 2 + 7/4^r × 2)/4, must be greater than that of (1), (14/5^r × 4)/4. This means that a repulsion greater than ln(14/7)/ln(5/4) ≈ 3.106 must be used.

On the contrary, a small repulsion can be used to group sparse databases, so that transactions sharing few common items may be put in the same cluster. For the database {abc, cde, fgh, hij}, a higher profit for the clustering {{abc, cde}, {fgh, hij}} than for {{abc}, {cde}, {fgh}, {hij}} requires a repulsion smaller than ln(6/3)/ln(5/3) ≈ 1.357.

Now we can state our problem of clustering transactional data.

Problem definition  Given D and r, find a clustering C that maximizes Profit_r(C).

Figure 3. The sketch of the CLOPE algorithm.

    /* Phase 1 - Initialization */
    1:  while not end of the database file
    2:      read the next transaction 〈t, unknown〉;
    3:      put t in an existing cluster or a new cluster Ci that maximizes profit;
    4:      write 〈t, i〉 back to the database;
    /* Phase 2 - Iteration */
    5:  repeat
    6:      rewind the database file;
    7:      moved = false;
    8:      while not end of the database file
    9:          read 〈t, i〉;
    10:         move t to an existing cluster or a new cluster Cj that maximizes profit;
    11:         if Ci ≠ Cj then
    12:             write 〈t, j〉;
    13:             moved = true;
    14: until not moved;
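For concreteness, the sketch in Figure 3 can be transcribed into a small in-memory Python program. The version below is our illustration only: it recomputes Profit_r from scratch at every step (using the naive profit_r above) rather than the incremental update described in Section 3, and it keeps cluster assignments in memory instead of writing them back to the file.

    def best_move(clusters, t, r):
        # Try t in every existing cluster and in a brand-new one; place it
        # where Profit_r is maximized, and return the chosen cluster index.
        best_j, best_p = 0, float('-inf')
        for j in range(len(clusters) + 1):    # index len(clusters) = new cluster
            trial = [list(c) for c in clusters] + [[]]
            trial[j].append(t)
            p = profit_r([c for c in trial if c], r)
            if p > best_p:
                best_j, best_p = j, p
        if best_j == len(clusters):
            clusters.append([])
        clusters[best_j].append(t)
        return best_j

    def clope(transactions, r):
        clusters, labels = [], []
        for t in transactions:                # Phase 1 - Initialization
            labels.append(best_move(clusters, t, r))
        moved = True
        while moved:                          # Phase 2 - Iteration
            moved = False
            for idx, t in enumerate(transactions):
                clusters[labels[idx]].remove(t)   # take t out of its cluster
                j = best_move(clusters, t, r)     # and re-place it greedily
                if j != labels[idx]:
                    labels[idx], moved = j, True
        return labels

On the running example of Section 1, clope([frozenset('ab'), frozenset('abc'), frozenset('acd'), frozenset('de'), frozenset('def')], r=2.0) recovers clustering (1).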

3. IMPLEMENTATION

Like most partition-based clustering approaches, we approximate the best solution by iteratively scanning the database. However, as our criterion function is defined globally, only with easily computable metrics like size and width, the execution speed is much faster than that of the local approaches.

Our implementation requires a first scan of the database to build the initial clustering, driven by the criterion function Profit_r.

After that, a few more scans are required to refine the clustering and optimize the criterion function. If no changes to the clustering are made during a scan, the algorithm stops, with the final clustering as the output. The output is simply an integer label for every transaction, indicating the cluster id that the transaction belongs to. The sketch of the algorithm is shown in Figure 3.

RAM data structure  In the limited RAM space, we keep only the current transaction and a small amount of information for each cluster. This information, called the cluster features (a name borrowed from BIRCH [14]), includes the number of transactions N, the number of distinct items (or width) W, a hash occ of 〈item, occurrence〉 pairs, and a pre-computed integer S for fast access to the size of the cluster. We write C.occ[i] for the occurrence of item i in cluster C, etc.

Remark  In fact, CLOPE is quite memory-saving; even an array representation of the occurrence data is practical for most transactional databases. The total memory required for item occurrences is approximately M×K×4 bytes using arrays of 4-byte integers, where M is the number of dimensions and K the number of clusters. A database with up to 10k distinct items and a clustering of 1k clusters fits into 40M of RAM.

The computation of profit  It is easy to update the cluster feature data when adding or removing a transaction. The computation of profit through cluster features is also straightforward, using the S, W, and N of every cluster. The most time-sensitive parts of the algorithm (statements 3 and 10 in Figure 3) are the comparisons of the profits of adding a transaction to each of the clusters (including an empty one). Although computing the profit requires summing up values from all the clusters, we can use the value change of the current cluster being tested to achieve the same but much faster judgement.

Figure 4. Computing the delta value of adding t to C.

    double DeltaAdd(C, t, r) {
        S_new = C.S + t.ItemCount;
        W_new = C.W;
        for (i = 0; i < t.ItemCount; i++)
            if (C.occ[t.items[i]] == 0) ++W_new;
        return S_new*(C.N+1)/pow(W_new, r) - C.S*C.N/pow(C.W, r);
    }

We use the function DeltaAdd(C, t, r) in Figure 4 to compute the change of the value C.S × C.N / (C.W)^r after adding transaction t to cluster C. The following theorem guarantees the correctness of our implementation.

Theorem  If DeltaAdd(Ci, t, r) is the maximum, then putting t in Ci maximizes Profit_r.

Proof  Observing the profit function, we find that the profits of putting t into different clusters only differ in the numerator part of the formula. Assume that the numerator of the clustering profit before adding t is X. Subtracting the constant X from each of the new numerators, we get exactly the values returned by the DeltaAdd function.
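To make the bookkeeping concrete, here is a small Python sketch of the cluster features and the delta computation (our scaffolding; the field names N, W, S, and occ follow the paper, but the class itself does not appear in it):

    class ClusterFeature:
        """Summary of one cluster: transaction count N, size S, occ hash."""
        def __init__(self):
            self.N = 0            # number of transactions in the cluster
            self.S = 0            # size S(C): sum of transaction lengths
            self.occ = {}         # item -> occurrence count Occ(i, C)

        @property
        def W(self):              # width W(C): number of distinct items
            return len(self.occ)

        def add(self, t):         # constant bookkeeping per item
            self.N += 1
            self.S += len(t)
            for i in t:
                self.occ[i] = self.occ.get(i, 0) + 1

        def remove(self, t):
            self.N -= 1
            self.S -= len(t)
            for i in t:
                self.occ[i] -= 1
                if self.occ[i] == 0:
                    del self.occ[i]

    def delta_add(c, t, r):
        """Change of C.S * C.N / (C.W)^r if t were added to c (cf. Figure 4)."""
        s_new = c.S + len(t)
        w_new = c.W + sum(1 for i in t if i not in c.occ)
        old = c.S * c.N / c.W ** r if c.N else 0.0
        return s_new * (c.N + 1) / w_new ** r - old

Statements 3 and 10 of Figure 3 then amount to evaluating delta_add against every existing cluster plus one empty ClusterFeature and picking the largest value.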

Time and space complexity  From Figure 4, we know that the time complexity of DeltaAdd is O(t.ItemCount). Suppose the average length of a transaction is A, the total number of transactions is N, and the maximum number of clusters is K. The time complexity of one iteration is then O(N×K×A), indicating that the execution speed of CLOPE is affected linearly by the number of clusters, and that the I/O cost is linear in the database size. Since only one transaction is kept in memory at any time, the space requirement of CLOPE is approximately the memory size of the cluster features. It is linear in the number of dimensions M times the maximum number of clusters K. For most transactional databases, this is not a heavy requirement.

4. EXPERIMENTS

In this section, we analyze the effectiveness and execution speed of CLOPE on two real-life datasets. For effectiveness, we compare the clustering quality of CLOPE on a labeled dataset (mushroom from the UCI repository) with those of LargeItem [13] and ROCK [7]. For execution speed, we compare CLOPE with LargeItem on a large web log dataset. All the experiments in this section are carried out on a PIII 450M Linux machine with 128M of memory.

4.1 Mushroom

The mushroom dataset from the UCI machine learning repository (http://www.ics.uci.edu/~mlearn/MLRepository.html) has been used by both ROCK and LargeItem for effectiveness tests. It contains 8,124 records with two classes: 4,208 edible mushrooms and 3,916 poisonous mushrooms. By treating the values of each attribute as items of transactions, we converted all the 22 categorical attributes to transactions with 116 distinct items (distinct attribute values). The 2,480 missing values of the stalk-root attribute are ignored in the transactions.

Figure 5. The result of CLOPE on mushroom (repulsion from 0.5 to 4.0 on the X-axis; purity on the left Y-axis, number of clusters on the right Y-axis).

We tried different repulsion values from 0.5 to 4, with a step of 0.1. A few of the results are shown in Figure 5. To give a general impression of the clustering quality, we use two metrics in the chart. The purity metric is computed by summing up, over all clusters, the larger of the number of edible and the number of poisonous mushrooms in the cluster. It has a maximum of 8,124, the total number of transactions. The number of clusters should be as small as possible, since a clustering with each transaction in its own cluster trivially achieves maximum purity.

When r=2.6, the number of clusters is 27, and there is only one cluster with mixed records: 32 poisonous and 48 edible (purity=8092). When r reaches 3.1, there are 30 clusters with perfect classification (purity=8124). Most of these results require at most 3 scans of the database. The number of transactions in these clusters varies, from 1 to 1,726 when r=2.6.

The above results are quite close to the results presented in the ROCK paper [7], where the only result given is 21 clusters with only one impure cluster containing 72 poisonous and 32 edible mushrooms (purity=8092), at a support of 0.8. Considering the quadratic time and space complexity of ROCK, the results of CLOPE are quite appealing.

The results of LargeItem presented in [13] on the mushroom dataset were derived hierarchically by recursive clustering of impure clusters, and are not directly comparable. We ran our LargeItem implementation to get direct results. The criterion function of LargeItem is defined as [13]:

    Cost_{\theta,w}(\mathbf{C}) = w \times Intra + Inter

Here θ is the minimum support, in percentage, for an item to be large in a cluster. Intra is the number of distinct small (non-large) items among all clusters, and Inter is the number of overlapping large items, which equals the total number of large items minus the number of distinct large items, among all clusters. A weight w is introduced to control the relative importance of Intra and Inter. The LargeItem algorithm tries to minimize the cost during the iterations. In our experiment, when the default w=1 was used, no good clustering was found for θ ranging from 0.1 to 1.0 (Figure 6(a)). After analyzing the results, we found that there was always a maximum value of Intra, for all the results. We increased w to make a larger Intra more expensive. When w reached 10, we found pure results with 58 clusters at support 1. The result for w=10 is shown in Figure 6(b).

Figure 6. The result of LargeItem on mushroom (minimum support from 0.1 to 0.9 on the X-axis; purity on the left Y-axis, number of clusters on the right Y-axis): (a) weight for Intra = 1; (b) weight for Intra = 10.

Our experimental results on the mushroom dataset show that, with a very simple intuition and linear complexity, CLOPE is quite effective. The result of CLOPE on mushroom is better than that of LargeItem and close to that of ROCK, which has quadratic complexity in the number of transactions. The comparison with LargeItem also shows that the simple idea behind CLOPE works quite well even without any explicit constraint on inter-cluster dissimilarity.

Sensitivity to data order  We also performed a sensitivity test of CLOPE on the order of the input data using mushroom. The results in Figures 5 and 6 are all derived with the original data order. We tested CLOPE with randomly ordered mushroom data.
The results are different but very close to the original ones, with a best result of purity=8124 with 28 clusters at r=2.9, and a worst result of purity=8124 with 45 clusters at r=3.9. This shows that CLOPE is not very sensitive to the order of the input data. However, our experiments on randomly ordered mushroom data show that LargeItem is more sensitive to data order than CLOPE.
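For reference, the purity metric used in this subsection is a one-liner; a sketch (ours, with each cluster represented simply as a list of class labels) is:

    def purity(clusters):
        # Sum, over clusters, of the count of the majority class in the cluster.
        return sum(max(c.count('edible'), c.count('poisonous')) for c in clusters)

A perfect clustering of mushroom reaches purity(clusters) == 8124.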

4.2 Berkeley web logs

Apart from market basket data, web log data is another typical category of transactional data. We chose the web log files from http://www.cs.berkeley.edu/logs/ as the dataset for our second experiment, testing the scalability as well as the performance of CLOPE. We used the web logs of November 2001 and preprocessed them with the methods proposed in [3]. There are about 7 million entries in the raw log file, and 2 million of them are kept after removing the non-HTML entries (those non-directory requests having extensions other than “.[s]htm[l]”). Among these 2 million entries, there are a total of 93,665 distinct pages. The client IP, the only available such field, is used for user identification. With a session idle time of 15 minutes, 613,555 sessions are identified. The average session length is 3.34.
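The sessionization step can be sketched as follows (our illustration; the entry format is an assumption, not the actual format of the Berkeley logs):

    from datetime import timedelta

    def sessionize(entries, idle=timedelta(minutes=15)):
        # entries: (client_ip, timestamp, page) tuples, sorted by timestamp.
        # A gap longer than `idle` between two requests from the same IP
        # closes the current session and starts a new one.
        last_seen, open_sessions, sessions = {}, {}, []
        for ip, ts, page in entries:
            if ip in last_seen and ts - last_seen[ip] > idle:
                sessions.append(open_sessions.pop(ip))
            open_sessions.setdefault(ip, []).append(page)
            last_seen[ip] = ts
        sessions.extend(open_sessions.values())  # flush sessions still open
        return sessions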

For the scalability test, we set the maximum number of clusters to 100 and ran CLOPE (r=1.0, 1.5, 2.0) and LargeItem (θ=0.2, 0.6, and 1.0, with w=1) on 10%, 50%, and 100% of the sessions respectively. The average per-iteration running time is shown in Figure 7.

Figure 7. The running time of CLOPE and LargeItem on the Berkeley web log data (percentage of the input data, total = 613,555 sessions, on the X-axis; execution time in seconds on the Y-axis; one curve per setting: LargeItem MinSupp=0.2, 0.6, 1.0 and CLOPE r=1.0, 1.5, 2.0).

From Figure 7, we can see that the execution times of both CLOPE and LargeItem are linear in the database size. For non-integer repulsion values, CLOPE runs slower because of the floating-point computational overhead. All these runs reach the maximum number of clusters allowed, except CLOPE with r=1, for which only 30 clusters were found for the whole session file. That is the reason for its very fast speed of less than 1 minute per iteration on the whole dataset. The execution time of LargeItem is roughly 3-5 times that of CLOPE, while LargeItem uses about 2 times the memory of CLOPE for the cluster feature data.

To get some impression of the effectiveness of CLOPE on noisy data, we ran CLOPE on the November session data with r=1.5 and a maximum of 1,000 clusters. The resulting clusters are ordered by the number of transactions they contain. Table 1 shows the largest cluster (C1000), with 20,089 transactions, and two other high-quality clusters found by CLOPE. These three clusters are quite good, but in many of the other clusters, pages from different paths are grouped together. Some of these may actually reveal common visiting patterns, while others may be due to noise inherent in the web logs. The results of the LargeItem algorithm on this dataset, however, are not very satisfying.

Table 1. Some clusters of CLOPE on the log data (r=1.5).

C781: N=554, W=6, S=1083
    /~lazzaro/sa/book/simple/index.html, occ=426
    /~lazzaro/sa/index.html, occ=332
    /~lazzaro/sa, occ=170
    /~lazzaro/sa/book/index.html, occ=120
    /~lazzaro/sa/video/index.html, occ=26
    /~lazzaro/sa/sfman/user/network/index.html, occ=9

C815: N=619, W=6, S=1172
    /~russell/aima.html, occ=388
    /~russell/code/doc/install.html, occ=231
    /~russell/code/doc/overview.html, occ=184
    /~russell/code/doc/user.html, occ=158
    /~russell/intro.html, occ=150
    /~russell/aima-bib.html, occ=61

C1000: N=20089, W=2, S=22243
    /, occ=19517
    /Students/Classes, occ=2726

(The number after each page name is its occurrence in the cluster.)

5. RELATED WORK

There are many works on clustering large databases, e.g., CLARANS [12], BIRCH [14], DBSCAN [4], and CLIQUE [1]. Most of them are designed for low-dimensional numerical data; the exception is CLIQUE, which finds dense subspaces in higher dimensions.

Recently, many works on clustering large categorical databases have begun to appear. The k-modes [10] approach represents a cluster of categorical values with the vector that has the minimal distance to all the points. The distance in k-modes is measured by the number of common categorical attributes shared by two points, with optional weights among different attribute values. Han et al. [8] use association rule hypergraph partitioning to cluster items in large transactional databases. STIRR [6] and CACTUS [5] also model categorical clustering as a hypergraph-partitioning problem, but these approaches are more suitable for databases made up of tuples. ROCK [7] uses the number of common neighbors between two transactions as the similarity measure, but its computational cost is heavy, and sampling has to be used when scaling to large datasets.

The work most similar to CLOPE is LargeItem [13]. However, our experiments show that CLOPE is able to find better clusters, and at a faster speed. Moreover, CLOPE requires only one parameter, the repulsion, which gives the user much control over the approximate number of resulting clusters with little domain knowledge. The minimum support θ and the weight w of LargeItem are more difficult to determine. Our sensitivity tests of the two algorithms also show that CLOPE is less sensitive than LargeItem to the order of the input data.

Moreover, many works on document clustering are closely related to transactional data clustering. In document clustering, each document is represented as a weighted vector of the words in it, and clustering is likewise carried out by optimizing a certain criterion function. However, document clustering tends to assign different weights to words with respect to their frequencies. See [15] for some common approaches to document clustering.

Also, there are some similarities between transactional data clustering and association analysis [2]. Both of these popular data mining techniques can reveal interesting properties of item co-occurrence and relationships in transactional databases. Moreover, current approaches for association analysis [9] need only very few scans of the database. However, there are differences. On the one hand, clustering can give a general overview of the data, while association analysis only finds the strongest item co-occurrence patterns. On the other hand, association rules are directly actionable, while clustering of large transactional data is mostly used as a preprocessing phase for other data mining tasks like association analysis.

6. CONCLUSION

In this paper, a novel algorithm for categorical data clustering called CLOPE is proposed, based on the intuitive idea of increasing the height-to-width ratio of the cluster histogram. The idea is generalized with a repulsion parameter that controls the tightness of transactions in a cluster, and thus the resulting number of clusters. The simple idea behind CLOPE makes it fast, scalable, and memory-saving when clustering large, sparse transactional databases with high dimensions. Our experiments show that CLOPE is quite effective in finding interesting clusterings, even though it does not specify explicitly any inter-cluster dissimilarity metric. Moreover, CLOPE is not very sensitive to data order, and requires little domain knowledge to control the number of clusters. These features make CLOPE a good clustering as well as preprocessing algorithm for mining transactional data like market basket data and web usage data.

7. ACKNOWLEDGEMENTS

We are grateful to Rajeev Rastogi and Vipin Kumar for providing us with the ROCK code and the technical report version of [7]. We wish to thank the providers of the UCI ML Repository and of the web log files of http://www.cs.berkeley.edu/. We also wish to thank the authors of [13] for their help. The comments from the three anonymous referees were invaluable in preparing the final version.